January 8, 2025
CDP part 5: user permissions management on CDP Public Cloud

When you create a user or a group in CDP, it requires permissions to access resources and use the Data Services.

This article is the fifth in a series of six.

CDP Public Cloud manages these permissions through roles, which control the scope of access to the resources.

There are two main types of roles:

  • Account Roles: permissions to access or perform tasks on all resources within the CDP tenant
  • Resource Roles: permissions to access or perform tasks on a specific resource, such as an environment

This article focuses on setting the roles and the Ranger policies required for the group of users created in User management on CDP Public Cloud with Keycloak, so that they can complete the lab article that closes this series.

By definition, a group in CDP is a collection of user accounts that have the same account and resource roles. Therefore we can manage all our needs at the group level.

Three remarks before starting the configuration:

  • At least one user of the group has to log in to make the group visible on the CDP console.
  • The PowerUser role is required to assign roles to a group.
  • The EnvironmentAdmin role is required to set the Ranger Policies.

Required Roles

To give users access to all resources required to set up the lab article, we need to assign them the following roles:

  • Account Roles:
    • PowerUser
    • DFCatalogAdmin
    • DFCatalogViewer
  • Resource Roles:
    • DWAdmin
    • DWUser
    • DFFlowAdmin
    • DFFlowUser
    • DEUser

In addition, we need to set the Data Access Role on the IDBroker Mappings to ensure user applications can access the Data Lake.

As in CDP Public Cloud deployment on AWS, the role configuration can be done via the Cloudera web interface or the CDP CLI. Both approaches are covered.

Configuring Roles using the CDP Web Interface

This approach is recommended if you are new to CDP. It is slower but gives you a better idea of the configuration process. If you did not install and configure the CDP CLI and the AWS CLI as described in Introduction to end-to-end data lakehouse architecture with CDP, this is also your only option.

If you want to go faster and use the terminal to set the roles, scroll down to the Configuring Roles from the Terminal section.

Note: You still need to use the CDP console to configure the Ranger policies since this task cannot be accomplished using the CDP CLI.

To set the Account Roles:

  1. Log in to the CDP console and select Management Console

  2. Navigate to User Management > Groups > Your group name

  3. Select Roles and click Update Roles

  4. Select the account roles from the list above and click Update

  5. The Roles tab of your group should now list the selected account roles

To set the Resource Roles:

  1. Log in to the CDP console and select Management Console

  2. Navigate to Environments > Your environment

  3. In the top right corner, select Actions and click Manage Access

  4. Select the Access tab and type your group name in the search box

  5. Select the resource roles from the list above and click Update

  6. The last step is to synchronize the users with the environment: click Synchronize Users

  7. Click Synchronize Users

To set the IDBroker Mappings:

  1. Select the IDBroker Mappings tab and click Edit

  2. Add Data Access Role

    1. Select your group name in the search box
    2. Copy the Data Access Role above
    3. Paste it into the Role space
    4. Click Save and Sync

  3. The IDBroker Mappings tab should now list your group with the Data Access Role

Configuring Roles from the Terminal

Configuring roles via the terminal is recommended for experienced users who want to set up their environment quickly. You need the CDP CLI and the AWS CLI installed on your system, as described in CDP part 1: introduction to end-to-end data lakehouse architecture with CDP.

Configuration via the terminal requires the following steps:

  1. Set Account Roles
  2. Set Resource Roles
  3. Set IDBroker mappings
  4. Synchronize users

Set Account Roles

To set the account roles, you need your group name and the CRN of the roles you want to assign. In order to do so, use the following commands:


export CDP_GROUP_NAME=adaltas-students

export ACCOUNT_ROLES=(PowerUser DFCatalogAdmin DFCatalogViewer)


# Print the CRN of the account role whose CRN ends with the given role name
get_crn_account_role () {
   CDP_ACCOUNT_ROLE_NAME=$1
   # jq -r prints the raw string, so no surrounding quotes need to be stripped
   cdp iam list-roles | jq -r --arg CDP_ACCOUNT_ROLE_NAME "$CDP_ACCOUNT_ROLE_NAME" '.roles[] | select(.crn | endswith($CDP_ACCOUNT_ROLE_NAME)) | .crn'
}

With all the required variables defined, you can set the roles.


for role_name in "${ACCOUNT_ROLES[@]}"; do
  cdp iam assign-group-role \
    --group-name "${CDP_GROUP_NAME}" \
    --role "$(get_crn_account_role "${role_name}")"
done
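Since the loop above changes permissions, it can be useful to preview the exact commands before running them. The following is a minimal sketch of a dry-run wrapper, not part of the CDP CLI: the `run` function and the `DRY_RUN` variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: show the command line instead of executing it
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage (assumes CDP_GROUP_NAME and get_crn_account_role are defined as above):
#   DRY_RUN=1 run cdp iam assign-group-role \
#     --group-name "$CDP_GROUP_NAME" \
#     --role "$(get_crn_account_role PowerUser)"
```

In dry-run mode the CRN lookup still queries `cdp iam list-roles`, which only reads data; the assignment itself is printed rather than executed.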

There is no immediate feedback if you successfully assign the roles. You can validate with this command:

cdp iam list-group-assigned-roles --group-name $CDP_GROUP_NAME
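Reading the JSON output by eye is error-prone when several roles are involved. The sketch below checks it against a list of expected role names. It assumes, as `get_crn_account_role` does, that each role CRN ends with the role name; the function name `missing_roles` is hypothetical.

```shell
# Hypothetical helper: read the CLI's JSON output on stdin and print every
# expected role name (given as arguments) that does not appear in it.
# Returns a non-zero status when at least one role is missing.
missing_roles() {
  local assigned status name
  assigned=$(cat)   # raw JSON text, read once
  status=0
  for name in "$@"; do
    case "$assigned" in
      # A role counts as assigned when a quoted string ends with its name
      *"${name}\""*) ;;
      *) echo "missing: ${name}"; status=1 ;;
    esac
  done
  return "$status"
}

# Usage (assumes CDP_GROUP_NAME is defined as above):
#   cdp iam list-group-assigned-roles --group-name "$CDP_GROUP_NAME" \
#     | missing_roles PowerUser DFCatalogAdmin DFCatalogViewer
```

The check is a plain text match rather than a JSON parse, which keeps it dependency-free but means it only looks for a string ending with the role name.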

Set Resource Roles

To set resource roles, you need the CRN of your CDP environment, your group of users, and the roles you want to assign. In order to do so, use the following commands:


export CDP_ENV_NAME=[your-environment-name]

export CDP_GROUP_CRN=$(cdp iam list-groups | jq -r --arg CDP_GROUP_NAME "$CDP_GROUP_NAME" '.groups[] | select(.groupName==$CDP_GROUP_NAME).crn')

export CDP_ENV_CRN=$(cdp environments describe-environment --environment-name ${CDP_ENV_NAME} | jq -r .environment.crn)

export RESOURCE_ROLES=(DWAdmin DWUser DFFlowAdmin DFFlowUser DEUser)

# Print the CRN of the resource role whose CRN ends with the given role name
get_crn_resource_role () {
   CDP_RESOURCE_ROLE_NAME=$1
   # jq -r prints the raw string, so no surrounding quotes need to be stripped
   cdp iam list-resource-roles | jq -r --arg CDP_RESOURCE_ROLE_NAME "$CDP_RESOURCE_ROLE_NAME" '.resourceRoles[] | select(.crn | endswith($CDP_RESOURCE_ROLE_NAME)) | .crn'
}

With all the required variables defined, you can set the roles.


for role_name in "${RESOURCE_ROLES[@]}"; do
  cdp iam assign-group-resource-role \
    --group-name "${CDP_GROUP_NAME}" \
    --resource-role-crn "$(get_crn_resource_role "${role_name}")" \
    --resource-crn "${CDP_ENV_CRN}"
done

There is no immediate feedback if you successfully assign the roles. You can validate with this command:

cdp iam list-group-assigned-resource-roles --group-name $CDP_GROUP_NAME

Set IDBroker mapping

To configure the IDBroker Mapping, you need information from your AWS CloudFormation stack. Retrieve this information using the following commands:


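# jq -r returns raw strings; without it the values keep their JSON double quotes and corrupt the ARNs built below
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity | jq -r .Account)
export CDP_RESOURCE_PREFIX=$(aws cloudformation describe-stacks --stack-name aws-${USER}-env | jq -r '.Stacks[].Parameters[] | select(.ParameterKey=="prefix").ParameterValue')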

export AWS_DATA_ADMIN_ROLE_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CDP_RESOURCE_PREFIX}-datalake-admin-role
export AWS_RANGER_AUDIT_ROLE_ARN=arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CDP_RESOURCE_PREFIX}-ranger-audit-role
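The ARNs above are assembled by string concatenation, so a stray quote or an empty variable yields an invalid ARN without any warning. As a sketch, the composition could be wrapped in a small function that validates its inputs first. The name `build_role_arn` is hypothetical, and the account ID and prefix in the usage comment are made-up values.

```shell
# Hypothetical helper: build an IAM role ARN from an account ID, a resource
# prefix and a role suffix, following the naming pattern used above.
build_role_arn() {
  local account_id="$1" prefix="$2" suffix="$3"
  case "$account_id" in
    # Reject empty values and anything that is not purely digits (e.g. leftover quotes)
    ""|*[!0-9]*) echo "invalid account id: ${account_id}" >&2; return 1 ;;
  esac
  case "$prefix" in
    # Reject empty prefixes and prefixes that still carry JSON quotes
    ""|*\"*) echo "invalid prefix: ${prefix}" >&2; return 1 ;;
  esac
  printf 'arn:aws:iam::%s:role/%s-%s\n' "$account_id" "$prefix" "$suffix"
}

# Usage with illustrative values:
#   build_role_arn 123456789012 demo datalake-admin-role
#   build_role_arn "$AWS_ACCOUNT_ID" "$CDP_RESOURCE_PREFIX" ranger-audit-role
```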

Now you can map the Data Access Role to your group in the IDBroker Mappings.

Note: The following command updates ALL the IDBroker Mappings configuration, which is why both Data Access and Ranger Audit roles are required.


cdp environments set-id-broker-mappings \
  --environment-name $CDP_ENV_NAME \
  --data-access-role $AWS_DATA_ADMIN_ROLE_ARN \
  --ranger-audit-role $AWS_RANGER_AUDIT_ROLE_ARN \
  --mappings accessorCrn=$CDP_GROUP_CRN,role=$AWS_DATA_ADMIN_ROLE_ARN
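Because this command replaces the whole IDBroker Mappings configuration, it may be worth saving the existing mappings before running it. The sketch below relies on the `cdp environments get-id-broker-mappings` command; the function name `backup_idbroker_mappings` and the backup file name are hypothetical.

```shell
# Hypothetical helper: save the current IDBroker mappings of an environment
# to a file so they can be inspected or restored by hand later.
backup_idbroker_mappings() {
  local env_name="$1" backup_file="$2"
  cdp environments get-id-broker-mappings \
    --environment-name "$env_name" > "$backup_file" || return 1
  echo "saved current mappings of ${env_name} to ${backup_file}"
}

# Usage (assumes CDP_ENV_NAME is defined as above; the file name is arbitrary):
#   backup_idbroker_mappings "$CDP_ENV_NAME" idbroker-mappings-backup.json
```

The saved file is only a record of the previous state. Restoring it means passing its roles and mappings back to `set-id-broker-mappings` manually.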

Synchronize Users and IDBroker Mappings

With all the configuration done, it’s time to synchronize both users and IDBroker mappings with your environment.


cdp environments sync-all-users \
  --environment-name $CDP_ENV_NAME


cdp environments sync-id-broker-mappings \
  --environment-name $CDP_ENV_NAME

Configure Ranger policies

One more layer of permissions must be configured before users can complete the lab: Ranger policies.

These policies are at the Data Warehouse service level. As you will see in the next article, users need to create and query tables on the data warehouse.

All this configuration is done via the Cloudera web interface using the Ranger console.

As a reminder, you need the PowerUser account role on CDP to follow along.

  1. Navigate to Data Warehouse

  2. In Overview, find the Database Catalog tile for your environment, click the three vertical dots in its top right corner, and select Open Ranger

  3. In the Ranger Service Manager, click Hadoop SQL

  4. Open policy 9: all – database, table, columns

    • Add {USER} under Allow Conditions, Select Users
    • Click Save

  5. Open policy 11: all – storage-type, storage-url

    • Add {USER} under Allow Conditions, Select Users
    • Click Save

Next Steps: Hands-On Lab on a CDP Public Cloud Environment

Finally, both users and architecture are ready, so it’s time to let users experiment with all the managed services of your AWS-hosted CDP Public Cloud Environment with the hands-on lab-article that closes this series.
