-
Notifications
You must be signed in to change notification settings - Fork 5
Made few updates and changes #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cm_service_restart.py
Outdated
| stale_config_only = select_from_items('Restart roles on nodes with stale config? Default is false', options) # Needs to be a user selection | ||
|
|
||
| # rr_command = service_rr.rolling_restart(batch_size, fail_count_threshold, None, stale_config_only, None, None, None).wait() | ||
| # return rr_command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented for now since I need to confirm which service can be given a shot before executing this.
Here I am getting few inputs from the user to set the parameters as per their wish. The ones I have passed as None will be defaulted to whatever they are.
Still a test needs to be done on this. But once this piece is merged, dont wanna take risk to lose any code . lol
cm_service_restart.py
Outdated
| import os | ||
| import sys | ||
|
|
||
| import smtplib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont see this being used anywhere..? i assume you're planning to send update emails
|
can you rename the script to |
cm_service_restart.py
Outdated
|
|
||
| elif action == defined_actions[2].lower() and ('hue' in service.name.lower() or 'hive' in service.name.lower() or \ | ||
| 'oozie' in service.name.lower() or 'spark' in service.name.lower()): # RR unavailable for these | ||
| print "Rolling Restart is unavailable in " + service.name.upper() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think i would rather just call it and then catch the exception - i think that's more pythonic.
edit: actually yea do this then you get rid of the if here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct and agree with that . I am making the script bit descriptive right now and once the functionalities work as expected , we can put the exception handling and reduce any extra lines of code.
|
Renamed to cm_service_manager.py |
cm_service_restart.py
Outdated
| elif action == defined_actions[1].lower(): # Deploy client configs | ||
| command = service.deploy_client_config().wait() | ||
|
|
||
| elif action == defined_actions[2].lower() and ('hue' in service.name.lower() or 'hive' in service.name.lower() or \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id rather you get rid of the if here with checking the service name against a list like i mentioned below https://github.com/atheiman/cm-automation-scripts-devops-hadoop/pull/2/files#r127274598
but for future reference i think this would work better
elif action == defined_actions[2].lower() and service_name in ['hue', 'hive', 'spark', 'oozie']
print "Rolling Restart is unavailable in " + service_name
# ....
elif action == defined_actions[2].lower() and service_name in ['hdfs', 'mapreduce', 'hbase', 'yarn', 'zookeeper']
print "Rolling Restart is unavailable in " + service_name
cm_service_manager.py
Outdated
| elif action == defined_actions[1].lower(): # Deploy client configs | ||
| command = service.deploy_client_config().wait() | ||
|
|
||
| elif action == defined_actions[2].lower() and ('hue' in service.name.lower() or 'hive' in service.name.lower() or \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still rather just call it and then catch the exception - its more pythonic and it handles the services that you havent listed here. you shouldnt be listing out all these services like this.
you can even make it so that if rolling restart is selected and not available then you just do a restart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually that does make sense. Cool, I will see what I can work out on these.
I was also feeling weird to code all these services explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simple example trying to illustrate what im describing
service = 'hive'
try:
print "executing rolling restart on service '%s'" % service
service.rolling_lower() # think of this as rolling_restart()
except AttributeError:
print "rolling restart not available on service '%s'" % service
print "executing restart on service '%s'" % service
service.lower() # think of this as restart()~ $ ~/tmp/test.py
executing rolling restart on service 'hive'
rolling restart not available on service 'hive'
executing restart on service 'hive'| print "WARNING: Rolling Restart is not available for '%s' " % service.name.lower() | ||
| print "Executing Restart on service '%s' " % service.name.lower() | ||
| rr_command = service.restart().wait() | ||
| return rr_command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the try block.
But one thing that I think is messing up would be : when service = 'hive/oozie/hue', then the batch_size etc would still be asked to input from the user, which would be unnecessary and which I do not want.
So i was thinking , inside of try block, can a single 'if' statement be used to verify the service and raise exception if service is one those? (I have commented out in line 46-47). This way if the service doesn't support RR, then user would not need to unnecessarily input those details and except block will be executed.
| server.quit() | ||
| except: | ||
| print "Invalid email information. Please check SMTP server info or mailing address" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mail block added. Mails are being sent successfully but not the the DL.
I still need to modify and generalize this part. Currently its a bit hard coded.
|
I guess I wasn't clear on this I still want the user to choose deploy client config, rolling restart, or restart. The logic I'm describing to you is for falling back to restart when a user selects rolling restart but it's not available. You're right, a user shouldn't select rolling restart for a non-distributed service. I was suggesting this try catch logic because it is a better approach than the checking you were doing for "can this service do a rolling restart". So yes, have the user select restart / rolling restart / deploy client config. then if it is rolling restart, If you want to leave that out I'm fine with it. But definitely don't define the list of services that are eligible for certain actions. Actually - probably the very best solution would be to get the list of actions available to the service the user chooses, then offer up the actions available on that service to the user (this is how the cm ui does it). |
cm_service_manager.py
Outdated
| server = smtplib.SMTP(server, 25) # SMTP server at port 25 | ||
|
|
||
| from_email = '[email protected]' | ||
| server.sendmail(from_email, '[email protected], [email protected]', message) # Send mail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any configuration in this repo needs to be, well, configurable.
This is for use outside of just Cerner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, i have to make it dynamic. the user mail id and receiver id both will be asked at runtime.
|
Regarding the try block. I have applies those changes already :) I had made these changes today, but had only one thing in my mind, that in case the user selects Hue (or oozie etc) , then the try block will definitely execute till the actual api method call, which includes the user inputs for number of nodes in batch, stale config etc. |
|
@atheiman , need review on this sir |
|
diff: https://gist.github.com/atheiman/590b50ddd4e37eae80a2ddf388e49bf8 edit: other way |
atheiman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think our back and forth has been a little slow (mostly because ive been super busy).
Im gonna rebase on these changes and make a few adjustments and then push that rather than merging this guy. youll still have the commit here tho.
| stale_config_only = select_from_items('option to Restart roles on nodes with stale config?(Default is false) \n', options) # Needs to be a user selection | ||
|
|
||
| rr_command = service_rr.rolling_restart(batch_size, fail_count_threshold, None, stale_config_only, None, None, None).wait() | ||
| except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont blindly catch exceptions - if youre gonna have a try / except narrow down the exception you are looking for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, agreed. I tried the attribute exception, but that didn't work tho.
Need to find out the exact one
| try: | ||
| print "Deploying Client Configurations on service '%s'... \n" % service.name.lower() | ||
| command = service.deploy_client_config().wait() | ||
| except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing here, handle specific exceptions
| rr_command = service_rr.rolling_restart(batch_size, fail_count_threshold, None, stale_config_only, None, None, None).wait() | ||
| except: | ||
| print "WARNING: Rolling Restart is not available for '%s'. Exiting... " % service.name.lower() | ||
| print "Executing Restart on service '%s' " % service.name.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets not fall back to hard restart from rolling restart. we'll just error out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error for what @atheiman ? Didn't get that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right below this you do service.restart()
id rather just call service.rolling_restart() and the api call will fail (error out)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But wasn't that the plan? Where the services which do not have RR functionality, will be just restarted over here.
I am coming to this , since the user is selecting that particular service and aims to cycle it.
So instead, can it be like follows:
User selects a service which doesn't support RR, so it will error out in the try block.
When it goes to the except block, that time we give a prompt "Instead do a Normal restart? Yes or No"
The user has the option here only to continue or not. Instead of starting from the beginning.
| options = ["true", "false"] # options for stale config | ||
| stale_config_only = select_from_items('option to Restart roles on nodes with stale config?(Default is false) \n', options) # Needs to be a user selection | ||
|
|
||
| rr_command = service_rr.rolling_restart(batch_size, fail_count_threshold, None, stale_config_only, None, None, None).wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i want to look at the doc to see what all these args are. can you link me?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| fail_count_threshold = raw_input("Stop rolling restart if more than this number of worker batches fail \n") | ||
|
|
||
| options = ["true", "false"] # options for stale config | ||
| stale_config_only = select_from_items('option to Restart roles on nodes with stale config?(Default is false) \n', options) # Needs to be a user selection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this wording is a bit misleading... maybe:
Restart roles only on instances with stale config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took from CM, but yea this looks good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh so i am misinterpretting it then i think. true is yes do include stale config instances, false is restart on every instance except those with stale config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the doc:
@param stale_configs_only: Restart roles with stale configs only. Default is false.
https://github.com/cloudera/cm_api/blob/master/python/src/cm_api/endpoints/services.py#L1458
so i was understanding it correctly. we'll want to prompt with Restart roles with stale configs only? yes (true) or no (false)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup correct.
Option yes(true) will restart the roles only on the nodes having stale configs
Option No(false), which is default, will restart the roles on all the nodes.
|
Im not having time to look at this RN, but I'll merge because @mamgainanoop confirmed he has used this to rolling restart. |
@atheiman
I made a few changes to the code. Took out the command block out of the if block .
Added few more conditions to the elif statements because few functionalities do not exist for few services.