Skip to content

Conversation

@raju-saravanan
Copy link
Contributor

@raju-saravanan raju-saravanan commented Jul 15, 2021

Changes in this PR:

  • Plugin support to create private load balancers for DW cluster.
  • Add support for the creation of DW database catalogs based on DW definitions.
  • Add support creation of DW virtual warehouses based on DW definitions.

Changes in this PR are done in conjunction with the following PRs:

@raju-saravanan raju-saravanan force-pushed the cdw-level-2 branch 2 times, most recently from 8217872 to b290255 Compare July 22, 2021 13:18
Copy link
Contributor

@Chaffelson Chaffelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. there's a minor hiccup with DW teardown but that is a bug already fixed in the purge PR, which is also due to be merged.

@raju-saravanan raju-saravanan force-pushed the cdw-level-2 branch 3 times, most recently from a4f0c27 to 8bd5738 Compare July 27, 2021 17:18
Copy link
Member

@wmudge wmudge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review. I'm going to continue testing and review, esp. with the changes/updates in the underlying cloudera.cloud.dw_* modules.

name: "{{ item.1.name | default([run__namespace, run__dw_vw_suffix ,__dw_dbc_index] | join('-')) }}"
dbc_name: "{{ item.0.name }}"
use_default_dbc: "{{ item.0.use_default_dbc }}"
type: "{{ item.1.type | default('hive') }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a variable set at the dw: global key level (with defaults)

dbc_name: "{{ item.0.name }}"
use_default_dbc: "{{ item.0.use_default_dbc }}"
type: "{{ item.1.type | default('hive') }}"
template: "{{ item.1.template | default('xsmall') }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a variable set at the dw: global key level (with defaults)

use_default_dbc: "{{ item.0.use_default_dbc }}"
type: "{{ item.1.type | default('hive') }}"
template: "{{ item.1.template | default('xsmall') }}"
autoscaling_min_nodes: "{{ item.1.autoscaling.min_nodes | default(None) }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use omit() than pass None or an empty config dict for lines 194-199

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am referencing these parameters in setup_aws file. So, if I use omit() here I would have to use omit() in setup_aws as well. If you are ok with this, I can make the changes

loop: "{{ run__dw_dbc_configs | subelements('virtual_warehouses')}}"
loop_control:
index_var: __dw_dbc_index
label: "{{ item.0.name }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also have a loop_var to avoid using item

run__dw_dbc_configs: "{{ run__dw_dbc_configs | default([]) | union([config]) }}"
vars:
include: "{{ lookup('template', __dw_config.include | default('experiences_config_placeholder.j2')) | from_yaml }}"
use_default_dbc: "{{ True if __dw_config.name is not defined else False | bool }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be inlined at line 177, also __dw_config.name is undefined | bool works?

- name: Assign AWS Private Subnet IDs in not assinged
when: run__datahub_private_subnet_ids is undefined
ansible.builtin.set_fact:
run__datahub_private_subnet_ids: "{{ [] }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run__datahub_private_subnet_ids: "{{ run__datahub_private_subnet_ids | default([]) }}" and remove the when

- name: Create CDP DW Database catalogs
when: not __dw_dbc_config.use_default_dbc
cloudera.cloud.dw_database_catalog:
cluster_id : "{{ run__dw_list.clusters[0].id }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps pull out into a separate task that can ensure the cluster 0 exists, etc. -- the set_fact variable could be __default_dbc etc.

- name: Set CDP DW Database catalog name to id map
when: __dw_dbc_build_async.database_catalogs is defined
ansible.builtin.set_fact:
run__dw_dbc_ids: "{{ run__dw_dbc_ids | default({}) | combine({ __dw_dbc_build_async.database_catalogs[0].name : __dw_dbc_build_async.database_catalogs[0].id}) }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Output of dw_database_catalog now returns a dict directly, not a list.

- name: Create CDP DW Virtual warehouse
cloudera.cloud.dw_virtual_warehouse:
cluster_id: "{{ run__dw_list.clusters[0].id }}"
dbc_id: "{{ run__dw_dbc_ids[__dw_vw_config.dbc_name] if not __dw_vw_config.use_default_dbc else run__dw_list.clusters[0].dbcs[0].id}}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be streamlined via the __default_dbc etc., also could use ternary as well

type: "{{ __dw_vw_config.type }}"
name: "{{ __dw_vw_config.name }}"
template: "{{ __dw_vw_config.template }}"
autoscaling_min_nodes: "{{ __dw_vw_config.autoscaling_min_nodes | int }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the omit() that would be set in initialize task list should carry through here to the module call.

@Chaffelson Chaffelson self-requested a review August 23, 2021 21:25
@Chaffelson
Copy link
Contributor

Create CDP DW Virtual warehouse is currently busted, and teardown is also not working. Please revise.

@raju-saravanan raju-saravanan force-pushed the cdw-level-2 branch 2 times, most recently from f98c32d to ea4323d Compare August 30, 2021 12:15
@raju-saravanan
Copy link
Contributor Author

@Chaffelson: Rebased the PR and made changes to make it work with webster's PR for Cloudera cloud.

Copy link
Contributor

@Chaffelson Chaffelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor change

@raju-saravanan
Copy link
Contributor Author

@wmudge : Can we get this PR merged ? Is there anything else we are waiting for ?

@Chaffelson
Copy link
Contributor

@wmudge : Can we get this PR merged ? Is there anything else we are waiting for ?

Not to my knowledge, provided the .cloud and cdpy dependencies are also clear.

@wmudge
Copy link
Member

wmudge commented Sep 13, 2021

@raju-saravanan Just wanted to get the other changes into main before tackling this. I will review again.

@wmudge wmudge added the enhancement MINOR - New feature or enhancement in the CHANGELOG label Sep 13, 2021
Signed-off-by: Saravanan Raju <[email protected]>
Signed-off-by: Saravanan Raju <[email protected]>
Signed-off-by: Saravanan Raju <[email protected]>
@raju-saravanan
Copy link
Contributor Author

@wmudge: Added support for partial teardown. The user would have to declare 'dw.teardown' tag to delete a subset of database_catalogs or virtual_warehouse (this will not delete the cdw cluster itself), if this is not declared we will do a cascade delete of the entire cdw cluster. eg:

dw: 
   teardown:
      database_catalogs:
        - name: "sraju-custom-dbc-2"
        - id: "warehouse-1632239880-b896"
      virtual_warehouses:
        - id: "compute-1632311757-7b8g"
        - name: "srx-6-vw-1"
        - name: "srx-6-vw-0"
        - name: "srx-6-vw-4"

You can mention vw and dbc to be deleted with its id or name as shown above. Found a small bug in dw_database_catalog.py raised a pr for this here (cloudera-labs/cloudera.cloud#33), which needs to be merged before this PR.

@Chaffelson
Copy link
Contributor

@wmudge Should this be closed as it is superseded by #57 ?

@wmudge wmudge marked this pull request as draft April 1, 2022 16:29
@Chaffelson
Copy link
Contributor

Replaced by #102

@Chaffelson Chaffelson closed this Nov 2, 2022
@wmudge wmudge added skip_changelog SKIP - Do not include in the CHANGELOG and removed enhancement MINOR - New feature or enhancement in the CHANGELOG labels Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip_changelog SKIP - Do not include in the CHANGELOG

Development

Successfully merging this pull request may close these issues.

4 participants