Blog anatomy - Infrastructure provisioning overview
Overview
CDK is an interface between you and CloudFormation
Simply speaking, CDK is just a CloudFormation generator, but wait ... that's a good thing.
CloudFormation is powerful, just not pleasant to use, in my opinion. I find CloudFormation hard to read, long-winded, non-extendable, arbitrarily structured, and I dislike the verbosity when using functions, especially when chaining them.
I haven't worked in an IDE that supports (read as "supports well") crafting CloudFormation. That gets even more important as your CloudFormation template repository grows and you’re trying to perform reliable refactoring, reference search, etc. It is a very important feature if you aren't the only person working on it.
CDK generating CloudFormation mitigates the above.
In addition, it gives us the possibility to debug the process mid-stage (template) when something goes wrong. It allows people who know CloudFormation to learn CDK faster by comparing CloudFormation generated from their CDK code with their expectations.
CDK supports multiple languages 🥳
At the time, I was really looking forward to using Python or Go (maybe even Java) to define IaC. I'm fine with HCL (used in Terraform) apart from the rare situations when I need a slightly more imperative style and definitive control over the flow. HCL can achieve a lot, but sometimes we use it to hammer a square peg into a round hole.
It soon transpired that the core of CDK requires Node.js even if you don't want to define your stack in JavaScript or TypeScript. It is also helpful to use npm in this ecosystem and, before you can say "bloat" three times, node_modules appears 😡. Since the damage was already done, I just went with the flow and picked TypeScript.
I dislike the forced dependency on a JS runtime and tooling. I do have to say, however, that the process of learning CDK (with TypeScript) and building the stack code was quite simple and much more pleasant than crafting pure CloudFormation. Before you erupt with joy, please bear in mind that I think many things are much more pleasant than that.
I recall two distinct challenges that I’ve encountered, as follows.
Sharing state/references across region/account/stack in CDK code 🤮
Please note:
I didn't come across StackSets in CDK in 2021, when I was porting from Terraform.
While they were present in CloudFormation at the time I suspect they may have been introduced in CDK slightly later.
I may be wrong here.
I suspect StackSets would require less boilerplate code from me.
My code was primarily working around cross-account and cross-region state sharing constraints.
It also orchestrated resource creation in a specific order to satisfy the dependency chain between them.
Both concerns may have been addressed already in StackSets.
Cross-region sharing of state/references in CDK code was awkward
When writing my first CDK code, I realized I just couldn't freely define the resources as I saw fit (like I could in Terraform). All my resources were initially crammed into one Stack as I wanted to get something running before I improved on it. I was quickly able to create an S3 bucket, IAM roles and policies, and was really upbeat about it. Once I started defining resources in other regions, my luck ran out. A Stack is tied to a specific region and several of my resources needed to be placed in specific regions, which was often dictated by AWS architecture:
- Route53 hosted zones are global (Northern Virginia region)
- CloudFront distributions (Northern Virginia region)
- Certificate Manager certificates (can use other regions but not if CloudFront is serving that certificate)
- Lambda@Edge, which I used before the introduction of CloudFront Functions (it also had to be deployed in Northern Virginia)
Other constructs were "created" in the region of my choosing.
Please note:
Talking about a region when creating IAM resources (or those of any other service) via CDK does not necessarily mean that the resource is regional.
It means that CDK will use that region to create the CloudFormation Stack.
A CloudFormation Stack (from the perspective of CDK at least) is a kind of internal resource (a helper) that is tied to the lifecycle of the actual resource.
It's a bit like having a UI to manage remotely stored Terraform state.
IAM, as an example, is global, while S3 is a curveball: it is regional, but buckets are visible across regions in the console, and they even have to have unique names across the universe. When CDK resources are created, however, they are synthesized into CloudFormation stacks, which require a region to "live" in.
After the naive setup where one Stack defined all resources, I grouped the resources by region and defined a Stack tied to each region.
That also didn't work very well for me, because of the cyclic dependencies between resources/regions. To illustrate, I had:
- CloudFront distribution required origins which are S3 buckets (crossing region)
- S3 grants rights to OriginAccessIdentity which is part of CloudFront API (crossing region)
- OriginAccessIdentity is created with a role which doesn’t exist until IAM primitives are created (flow control)
So I decided to break away the resources that are least dependent and provision them first, following with the next batch, which depends only on the previous ones. Resources that were provisioned at the same stage and belonged to the same region were grouped together.
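The batching described above can be sketched as a small dependency-ordering routine. This is only an illustration under assumptions: the `ResourceSpec` shape, the resource names, the regions and the dependency edges are hypothetical and are not taken from the actual stack code.

```typescript
// Hypothetical description of a resource: names, regions and edges are illustrative.
interface ResourceSpec {
  name: string;
  region: string;
  dependsOn: string[];
}

// Stage 0 holds resources with no dependencies; every other resource lands
// one stage after the latest stage among its dependencies.
function assignStages(resources: ResourceSpec[]): Map<string, number> {
  const byName = new Map<string, ResourceSpec>();
  for (const r of resources) byName.set(r.name, r);
  const stages = new Map<string, number>();
  const visiting = new Set<string>();

  const visit = (name: string): number => {
    const known = stages.get(name);
    if (known !== undefined) return known;
    const spec = byName.get(name);
    if (spec === undefined) throw new Error(`Unknown resource: ${name}`);
    if (visiting.has(name)) throw new Error(`Cyclic dependency at: ${name}`);
    visiting.add(name);
    let stage = 0;
    for (const dep of spec.dependsOn) stage = Math.max(stage, visit(dep) + 1);
    visiting.delete(name);
    stages.set(name, stage);
    return stage;
  };

  for (const r of resources) visit(r.name);
  return stages;
}

// Resources sharing a stage and a region form one group (a candidate Stack).
function groupIntoStacks(resources: ResourceSpec[]): Map<string, string[]> {
  const stages = assignStages(resources);
  const groups = new Map<string, string[]>();
  for (const r of resources) {
    const key = `stage${stages.get(r.name)}:${r.region}`;
    const members = groups.get(key) ?? [];
    members.push(r.name);
    groups.set(key, members);
  }
  return groups;
}

const example: ResourceSpec[] = [
  { name: "iamRole", region: "eu-west-2", dependsOn: [] },
  { name: "hostedZone", region: "us-east-1", dependsOn: [] },
  { name: "originAccessIdentity", region: "us-east-1", dependsOn: ["iamRole"] },
  { name: "certificate", region: "us-east-1", dependsOn: ["hostedZone"] },
  { name: "bucket", region: "eu-west-2", dependsOn: ["originAccessIdentity"] },
  { name: "distribution", region: "us-east-1", dependsOn: ["bucket", "certificate"] },
];
const stacks = groupIntoStacks(example);
```

With this input, `stacks` contains one group per stage and region, for instance `stage1:us-east-1` holding `originAccessIdentity` and `certificate`. A cycle in the input is reported as an error rather than resolved, which is the point at which a resource has to be split out by hand.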
I ended up with this:
That was better, but I still had a challenge to overcome.
I was unable to simply share handles of resources or other, more dynamic results across stacks in different regions. It was possible but required SSM parameters or other hoop jumping accompanied by circus music in the background.
I was genuinely disheartened: I was using a high-level programming language (TypeScript), yet I couldn't simply pass a reference across regions.
Because provisioning of this blog always happened on a single machine, I opted for a simple state-sharing solution: the local filesystem. I wrote each Stack's exported outputs into a file. Before the next stage ran, the file was ingested, the values of interest were extracted, and they were passed as parameters to the next stage.
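A minimal sketch of that file-based hand-off, assuming the outputs are stored as JSON keyed by stack name and then by output key. `cdk deploy --outputs-file` writes a file of this shape, though the original setup may have produced its file differently. The helper names, the `BucketName` output, the `BlogCdnStack` name and the use of a context value (`-c`) for the next stage are all hypothetical.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Assumed file shape: { "<StackName>": { "<OutputKey>": "<value>" } }.
type StackOutputs = Record<string, Record<string, string>>;

// Persist the outputs of a finished stage.
function writeOutputs(file: string, outputs: StackOutputs): void {
  fs.writeFileSync(file, JSON.stringify(outputs, null, 2), "utf8");
}

// Read one value back, failing loudly if the earlier stage did not export it.
function readOutput(file: string, stack: string, key: string): string {
  const all = JSON.parse(fs.readFileSync(file, "utf8")) as StackOutputs;
  const value = all[stack]?.[key];
  if (value === undefined) {
    throw new Error(`Missing output ${stack}.${key} in ${file}`);
  }
  return value;
}

// Demo in a throwaway temp directory; stack and output names are hypothetical.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "blog-stage-"));
const outputsFile = path.join(dir, "stage1-outputs.json");
writeOutputs(outputsFile, { BlogStorageStack: { BucketName: "example-bucket" } });

const bucketName = readOutput(outputsFile, "BlogStorageStack", "BucketName");
// Arguments the next stage could be invoked with, passing the value as context.
const nextStageArgs = ["deploy", "BlogCdnStack", "-c", `bucketName=${bucketName}`];
```

Failing on a missing key is deliberate: a silent `undefined` would only surface later as a confusing synthesis or deployment error in the next stage.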
Please note:
I am aware of crossRegionReferences: true, which simplifies this greatly.
I started using that after the most recent tech stack update in 2024.
I believe this wasn't available at the time I was learning CDK in 2021.
I was able to limit my reliance on the local filesystem thanks to that.
Cross-account sharing in CDK would be equally awkward
I was confident I wanted to have separate environments for the blog.
I'm trying to avoid a mix-up of resources co-existing in production, test or other environments, or any of my other projects overlapping the management account, for that matter. If a credential leaks, that environment can be shut down. If there have been careless changes or destructive experimentation, that environment can be reinstated. It also serves as a nice dividing line when looking at the expenditure.
I leveraged AWS Organizations to define environments (maintenance/production/test/etc.).
Passing cross-account references was expected to be cumbersome. Since I was already using the local filesystem to share cross-region state, the same approach turned out to fit cross-account sharing nicely.
Please note: Again, cross-account concerns are said to be addressed in StackSets (as is cross-region sharing).
Repeated MFA prompts 😫
The following snippets should help you understand my setup:
Snippet of my AWS credentials file
[seb]
aws_access_key_id = <REDACTED>
aws_secret_access_key = <REDACTED>

[...]
and one of my AWS config file
[default]
region = eu-west-2
output = json

[profile blog-devops-test]
source_profile = seb
mfa_serial = arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb
role_arn = arn:aws:iam::<REDACTED TEST ACC>:role/blog-devops
role_session_name = devops-of-test-blog

[profile blog-author-test]
source_profile = seb
mfa_serial = arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb
role_arn = arn:aws:iam::<REDACTED TEST ACC>:role/blog-author
role_session_name = author-of-test-blog

[...]
Now then, there's something very fundamental that feels unfinished in CDK.
Executing a series of AWS CDK commands with MFA
cdk diff --profile blog-devops-test BlogStorageStack
MFA token for arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb:
XXXXXX

[... output...]

# mere seconds later, exactly the same command (and profile) and it will ask me for MFA again

cdk diff --profile blog-devops-test BlogStorageStack
MFA token for arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb:
# wait until new token is generated (every 60s)
YYYYYY

[... output...]
As you can see above, using an MFA-protected user with CDK is very choppy. Even invocations within seconds of each other, using the same profile, prompt for MFA every time. I know of people who stopped using MFA to get a smoother workflow or to be able to automate it. I think there are third-party tools that can help, but is that really the way we have to handle credentials?
Why wouldn't the session caching match what AWS CLI does? (sorry for the spoiler).
Executing a series of AWS CLI commands with MFA (as a contrast)
aws s3 ls s3://some-bucket --profile blog-devops-test
Enter MFA code for arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb:
XXXXXX

[... output...]

# significantly later

aws s3 ls s3://some-bucket --profile blog-devops-test

# notice no need for MFA entry
[... output...]

# way later after the very first invocation using this profile (session expired)

aws s3 ls s3://some-bucket --profile blog-devops-test
Enter MFA code for arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb:
YYYYYY

[... output...]
Once a correct MFA code is entered, the session with the assumed role is cached. Later commands using the same profile don't require authentication until the session expires (the duration is configurable, up to the role's maximum session duration, which can be set between 1 and 12 hours).
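The "cached until it expires" check can be sketched as follows. It assumes that a cached entry is the JSON of an assume-role response with a `Credentials.Expiration` timestamp, which is how the files under ~/.aws/cli/cache appear to be laid out; that layout is an observation, not a documented interface, and the function name is hypothetical.

```typescript
// Assumed shape of a cached assume-role session; only the expiry matters here.
interface CachedSession {
  Credentials?: { Expiration?: string };
}

// True when the entry parses and its expiry timestamp is still in the future.
function isSessionUsable(cacheEntryJson: string, now: Date): boolean {
  let entry: CachedSession | null;
  try {
    entry = JSON.parse(cacheEntryJson) as CachedSession | null;
  } catch {
    return false; // unreadable entry: treat it as a cache miss
  }
  const expiration = entry?.Credentials?.Expiration;
  if (typeof expiration !== "string") return false;
  const expiresAt = Date.parse(expiration);
  return !Number.isNaN(expiresAt) && expiresAt > now.getTime();
}

// Sample entry with a made-up expiry time.
const sampleEntry = JSON.stringify({
  Credentials: { Expiration: "2024-05-01T13:00:00Z" },
});
const duringSession = isSessionUsable(sampleEntry, new Date("2024-05-01T12:30:00Z"));
const afterExpiry = isSessionUsable(sampleEntry, new Date("2024-05-01T13:00:01Z"));
```

Here `duringSession` is true and `afterExpiry` is false, mirroring the second and third invocations in the listing above: no prompt while the cached session is valid, a fresh MFA prompt once it has expired.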
MFA aspect summary
While using CDK, I keep getting prompted for MFA even though the invocations may be in close succession (problem #1).
AWS CLI works perfectly when the profile's session has already been cached and has not expired.
In my opinion, however, AWS CLI falls short in the following scenario (problem #2):
# CAREFUL!: the following mv invocation will rename the default location of AWS CLI cache
# to demonstrate a point (you will have to reauthenticate). When done just run
# mv ~/.aws/cli/cache{.bkp/*,} && rmdir ~/.aws/cli/cache.bkp

#mv ~/.aws/cli/cache{,.bkp}

aws s3 ls s3://some-bucket --profile blog-author-test
Enter MFA code for arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb:
XXXXXX

[... output...]

# mere seconds later, notice using different profile

aws s3 ls s3://some-bucket --profile blog-devops-test
Enter MFA code for arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb:
YYYYYY

[... output...]

# after this step both profiles are cached and can be used without MFA prompt until session expiry
The commented-out mv command at the top emulates an empty AWS CLI cache by renaming the cache directory. I then invoke AWS CLI twice, each time with a different profile. Each profile assumes a different role, but both profiles refer to the same source profile (and the same MFA device, as per the configuration). The MFA prompt appears once per profile, after which that profile's session is cached (until it expires again).
I don't understand why the process is designed like that.
Surely, the point of MFA is to increase certainty that the user is who they say they are. In the above scenario the user is the same for both invocations since the source profile is the same. Could that fact not be cached instead of, or alongside, the session with the assumed role? I understand that caching an assumed role session is less risky than caching the MFA-verified user session, but I would still love to be able to cache it for some configurable (probably short) time.
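To make the idea concrete, here is a sketch of a cache keyed by the source profile and MFA device rather than by the assumed role. It is purely hypothetical: neither AWS CLI nor CDK is claimed to work this way, and the class, its methods and the five-minute lifetime are illustrative assumptions.

```typescript
// Hypothetical profile shape mirroring the config fields shown earlier.
interface Profile {
  name: string;
  sourceProfile: string;
  mfaSerial: string;
}

// Remembers a successful MFA check per (source profile, MFA device)
// for a short, configurable time.
class MfaVerificationCache {
  private readonly ttlMs: number;
  private readonly verifiedUntil = new Map<string, number>();

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  private key(p: Profile): string {
    return `${p.sourceProfile}|${p.mfaSerial}`;
  }

  // Call only after the MFA token was actually accepted.
  recordVerification(p: Profile, nowMs: number): void {
    this.verifiedUntil.set(this.key(p), nowMs + this.ttlMs);
  }

  isVerified(p: Profile, nowMs: number): boolean {
    const until = this.verifiedUntil.get(this.key(p));
    return until !== undefined && until > nowMs;
  }
}

const mfa = "arn:aws:iam::<REDACTED PARENT ACC>:mfa/seb";
const author: Profile = { name: "blog-author-test", sourceProfile: "seb", mfaSerial: mfa };
const devops: Profile = { name: "blog-devops-test", sourceProfile: "seb", mfaSerial: mfa };

const cache = new MfaVerificationCache(5 * 60 * 1000); // assumed five-minute lifetime
const firstNeedsPrompt = !cache.isVerified(author, 0);
cache.recordVerification(author, 0);
const secondNeedsPrompt = !cache.isVerified(devops, 10_000); // ten seconds later
const laterNeedsPrompt = !cache.isVerified(devops, 6 * 60 * 1000); // after the lifetime
```

In this sketch only the first invocation prompts: the second profile shares the source profile and MFA device, so it reuses the verification until the short lifetime elapses, after which a prompt is needed again.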
The concept of caching sessions is always going to be a compromise between security and convenience.
Wouldn't this, however, improve the user experience and stop users from switching off security features or compromising the security of their accounts by the introduction of other tools? 🤔
Whatever the evolution of this process is, I'm thrilled I don't have to appear in person at the AWS office every time I want to authenticate. Now I said it ...