Use a common config map to remove duplicates #41

@kimoonkim

We have multiple Helm charts: a namenode chart, a datanode chart, and so on. Each chart populates its own Hadoop core-site.xml and hdfs-site.xml, which takes up a significant part of the chart YAML files and creates a lot of duplication. We should factor those config keys out into a single shared configmap, say hdfs-config, and let each chart simply mount the configmap in place.
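As a rough sketch of what the shared configmap could look like, here are a few of the static keys that the charts currently set through env vars. Only the name hdfs-config comes from the proposal; the file layout is an assumption. The property names assume the image's env var convention maps `_` to `.` and `___` to `-`.

```yaml
# Hypothetical shared configmap. Values are copied from the current charts;
# only a few static keys are shown.
apiVersion: v1
kind: ConfigMap
metadata:
  name: hdfs-config
data:
  core-site.xml: |
    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hdfs-k8s</value>
      </property>
      <property>
        <name>hadoop.security.authentication</name>
        <value>kerberos</value>
      </property>
    </configuration>
  hdfs-site.xml: |
    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>dfs.nameservices</name>
        <value>hdfs-k8s</value>
      </property>
      <property>
        <name>dfs.ha.namenodes.hdfs-k8s</name>
        <value>nn0,nn1</value>
      </property>
    </configuration>
```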

Our charts use a third-party Hadoop Docker image from uhopper. The image's entrypoint.sh already supports custom config files:

if [ -n "$HADOOP_CUSTOM_CONF_DIR" ]; then
    if [ -d "$HADOOP_CUSTOM_CONF_DIR" ]; then
        for f in `ls $HADOOP_CUSTOM_CONF_DIR/`; do
            echo "Applying custom Hadoop configuration file: $f"
            ln -sfn "$HADOOP_CUSTOM_CONF_DIR/$f" "/etc/hadoop/$f"
        done
    else
        echo >&2 "Hadoop custom configuration directory not found or not a directory. Ignoring: $HADOOP_CUSTOM_CONF_DIR"
    fi
fi
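A minimal sketch of how a chart could hook into that code path, assuming a configmap named hdfs-config already exists. The container name, volume name, and mount path below are hypothetical.

```yaml
# Fragment of a pod spec, e.g. for the datanode daemonset.
# The entrypoint should symlink every file found under
# HADOOP_CUSTOM_CONF_DIR into /etc/hadoop.
spec:
  containers:
    - name: datanode
      env:
        - name: HADOOP_CUSTOM_CONF_DIR
          value: /etc/hadoop-custom-conf   # hypothetical mount path
      volumeMounts:
        - name: hdfs-config-volume         # hypothetical volume name
          mountPath: /etc/hadoop-custom-conf
          readOnly: true
  volumes:
    - name: hdfs-config-volume
      configMap:
        name: hdfs-config
```

With this in place, the per-chart CORE_CONF_* and HDFS_CONF_* env entries for the shared keys could presumably be dropped, though how they interact with the symlinked files would need checking against entrypoint.sh.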

And here is the uniq'd list of current config keys as of #39:

$ grep -e CORE_CONF -e HDFS_CONF */*/*.yaml | cut -d':' -f3 | sort | uniq
 CORE_CONF_fs_defaultFS
 CORE_CONF_ha_zookeeper_quorum
 CORE_CONF_hadoop_rpc_protection
 CORE_CONF_hadoop_security_authentication
 CORE_CONF_hadoop_security_authorization
 HDFS_CONF_dfs_block_access_token_enable
 HDFS_CONF_dfs_client_failover_proxy_provider_hdfs___k8s
 HDFS_CONF_dfs_datanode_address
 HDFS_CONF_dfs_datanode_data_dir
 HDFS_CONF_dfs_datanode_http_address
 HDFS_CONF_dfs_datanode_kerberos_principal
 HDFS_CONF_dfs_datanode_kerberos_https_principal
 HDFS_CONF_dfs_datanode_keytab_file
 HDFS_CONF_dfs_encrypt_data_transfer
 HDFS_CONF_dfs_ha_automatic___failover_enabled
 HDFS_CONF_dfs_ha_fencing_methods
 HDFS_CONF_dfs_ha_namenodes_hdfs___k8s
 HDFS_CONF_dfs_journalnode_edits_dir
 HDFS_CONF_dfs_journalnode_kerberos_internal_spnego_principal
 HDFS_CONF_dfs_journalnode_kerberos_principal
 HDFS_CONF_dfs_journalnode_keytab_file
 HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check
 HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn0
 HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn1
 HDFS_CONF_dfs_namenode_kerberos_https_principal
 HDFS_CONF_dfs_namenode_kerberos_principal
 HDFS_CONF_dfs_namenode_keytab_file
 HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
 HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn1
 HDFS_CONF_dfs_namenode_shared_edits_dir
 HDFS_CONF_dfs_nameservices
 HDFS_CONF_dfs_web_authentication_kerberos_principal

Here is the full non-uniq'd list including values (grep -e CORE_CONF -e HDFS_CONF -A 1 */*/*.yaml):

hdfs-client/templates/client-deployment.yaml:            - name: CORE_CONF_hadoop_security_authentication
hdfs-client/templates/client-deployment.yaml-              value: kerberos
--
hdfs-client/templates/client-deployment.yaml:            - name: CORE_CONF_hadoop_security_authorization
hdfs-client/templates/client-deployment.yaml-              value: "true"
--
hdfs-client/templates/client-deployment.yaml:            - name: CORE_CONF_hadoop_rpc_protection
hdfs-client/templates/client-deployment.yaml-              value: privacy
--
hdfs-client/templates/client-deployment.yaml:            - name: CORE_CONF_fs_defaultFS
hdfs-client/templates/client-deployment.yaml-              value: hdfs://hdfs-k8s
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_nameservices
hdfs-client/templates/client-deployment.yaml-              value: hdfs-k8s
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_ha_namenodes_hdfs___k8s
hdfs-client/templates/client-deployment.yaml-              value: nn0,nn1
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
hdfs-client/templates/client-deployment.yaml-              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn1
hdfs-client/templates/client-deployment.yaml-              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn0
hdfs-client/templates/client-deployment.yaml-              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:50070
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn1
hdfs-client/templates/client-deployment.yaml-              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:50070
--
hdfs-client/templates/client-deployment.yaml:            - name: HDFS_CONF_dfs_client_failover_proxy_provider_hdfs___k8s
hdfs-client/templates/client-deployment.yaml-              value: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
--
hdfs-client/templates/client-deployment.yaml:            - name: CORE_CONF_fs_defaultFS
hdfs-client/templates/client-deployment.yaml-              value: hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: CORE_CONF_hadoop_security_authentication
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: kerberos
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: CORE_CONF_hadoop_security_authorization
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: "true"
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: CORE_CONF_hadoop_rpc_protection
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: privacy
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_block_access_token_enable
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: "true"
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_encrypt_data_transfer
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: "true"
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_datanode_kerberos_principal
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name:  HDFS_CONF_dfs_datanode_kerberos_https_principal
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: HTTP/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_web_authentication_kerberos_principal
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: HTTP/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_datanode_keytab_file
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: /etc/security/hdfs.keytab
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_datanode_address
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: 0.0.0.0:1004
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_datanode_http_address
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: 0.0.0.0:1006
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: CORE_CONF_fs_defaultFS
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs://hdfs-k8s
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_nameservices
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs-k8s
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_ha_namenodes_hdfs___k8s
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: nn0,nn1
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn1
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn0
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:50070
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn1
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:50070
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: CORE_CONF_fs_defaultFS
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-datanode-k8s/templates/datanode-daemonset.yaml:            - name: HDFS_CONF_dfs_datanode_data_dir
hdfs-datanode-k8s/templates/datanode-daemonset.yaml-              value: |-
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: CORE_CONF_hadoop_security_authentication
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: kerberos
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: CORE_CONF_hadoop_security_authorization
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: "true"
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: CORE_CONF_hadoop_rpc_protection
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: privacy
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: HDFS_CONF_dfs_journalnode_kerberos_principal
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: hdfs/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: HDFS_CONF_dfs_journalnode_kerberos_internal_spnego_principal
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: HTTP/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: HDFS_CONF_dfs_journalnode_keytab_file
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: /etc/security/hdfs.keytab
--
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml:            - name: HDFS_CONF_dfs_journalnode_edits_dir
hdfs-journalnode-k8s/templates/journalnode-statefulset.yaml-              value: /hadoop/dfs/journal
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: CORE_CONF_hadoop_security_authentication
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: kerberos
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: CORE_CONF_hadoop_security_authorization
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: "true"
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: CORE_CONF_hadoop_rpc_protection
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: privacy
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_block_access_token_enable
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: "true"
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_encrypt_data_transfer
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: "true"
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_kerberos_principal
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_kerberos_https_principal
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: HTTP/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_web_authentication_kerberos_principal
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: HTTP/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_keytab_file
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: /etc/security/hdfs.keytab
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_journalnode_kerberos_principal
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_journalnode_kerberos_internal_spnego_principal
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: HTTP/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: CORE_CONF_fs_defaultFS
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs://hdfs-k8s
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: CORE_CONF_ha_zookeeper_quorum
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: {{ .Values.zookeeperQuorum }}
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_nameservices
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs-k8s
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_ha_namenodes_hdfs___k8s
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: nn0,nn1
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn1
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn0
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:50070
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn1
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:50070
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_shared_edits_dir
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: qjournal://{{ .Values.journalQuorum }}/hdfs-k8s
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_ha_automatic___failover_enabled
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: "true"
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_ha_fencing_methods
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: "shell(/bin/true)"
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_client_failover_proxy_provider_hdfs___k8s
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
--
hdfs-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check
hdfs-namenode-k8s/templates/namenode-statefulset.yaml-              value: "false"
--
hdfs-simple-namenode-k8s/templates/namenode-statefulset.yaml:            - name: HDFS_CONF_dfs_namenode_datanode_registration_ip___hostname___check
hdfs-simple-namenode-k8s/templates/namenode-statefulset.yaml-              value: "false"

Many values are static. Some are dynamic, but I believe all dynamic values can be determined from user-supplied parameters. This means we should introduce a new Helm chart that populates the configmap, which is fine, especially since we are heading toward an uber-chart.
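For the dynamic values, the new chart's template might look roughly like the sketch below. It reuses the .Values.kerberosRealm, .Values.zookeeperQuorum, and .Values.journalQuorum parameters that the existing charts already take. The template path and the choice of keys are assumptions.

```yaml
# templates/configmap.yaml in a hypothetical new config chart.
# Only three dynamic keys are shown.
apiVersion: v1
kind: ConfigMap
metadata:
  name: hdfs-config
data:
  core-site.xml: |
    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>{{ .Values.zookeeperQuorum }}</value>
      </property>
    </configuration>
  hdfs-site.xml: |
    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>dfs.namenode.kerberos.principal</name>
        <value>hdfs/_HOST@{{ required "A valid kerberosRealm entry required!" .Values.kerberosRealm }}</value>
      </property>
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://{{ .Values.journalQuorum }}/hdfs-k8s</value>
      </property>
    </configuration>
```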

A few notes:

  • The entrypoint.sh also seems to populate some config keys internally. We should look into those.
  • We currently seem to rely on default values for some config keys, for instance dfs.namenode.name.dir. We may want to specify them explicitly in the configmap.
