What is AWS CDK?
With the AWS CDK, you can build reliable, scalable, and cost-effective applications in the cloud with the remarkable expressive power of a programming language.
I think the AWS CDK has revolutionized the way we create resources in the cloud.
AWS CDK allows developers to write infrastructure as code and feel right at home.
It is such a flexible and powerful tool that it is easy to make mistakes.
Some mistakes are very costly.
A few weeks ago I asked developers on LinkedIn and Twitter why they don't use AWS CDK (besides multi-cloud requirements) and got lots of answers and questions. Thank you for sharing your experiences!
I see myself as an advocate of the AWS CDK; I have given internal talks on CDK at my workplace, organized a webinar, and gave a talk at CDK Day 2020.
So I decided the best way to address these concerns and convert people to CDK was to:
1. Write a CDK best practices guide compiled from almost three years of development with CDK.
2. Provide a working CDK template project that implements these practices, which can be found here.
AWS CDK Best Practices
These guidelines expand on the official AWS CDK best practices.
Please note that this blog post assumes that you have a basic understanding of the AWS CDK and understand stack concepts and constructs.
These guidelines fall into several categories:
Project Structure and Constructs Guidelines
Resilience and Security Guidelines
General Development Tips
Project Structure and Constructs Guidelines
These guidelines discuss a recommended project structure.
One repository, one application, one stack
“Every application starts with a single stack in a single repository”: AWS CDK Best Practices Guide
In accordance with official AWS CDK best practices, your code repository should not contain more than one CDK application: a stack made of one or more constructs, maintained by a single team.
I agree with this practice.
Adding more stacks and applications to the repository increases deployment time and the blast radius in case of errors.
Prefer multiple constructs over multiple stacks.
However, how do you know when to split a stack into multiple stacks?
My general rule of thumb is to split a stack when:
The new stack is another service or microservice with additional requirements or deployment strategies (multi-region deployment, etc.).
Another team will manage the new stack.
The new stack has a different business domain.
The stack's implementation has become too long or complex. In that case, divide the stack into microservices that span multiple repositories, and define the dependency/contract between the microservices with REST API/SQS/EventBridge/SNS, etc.
AWS SaaS Service Repository Structure
The AWS CDK code and the AWS Lambda drivers for the service are in the same repository but in different folders as separate Python modules.
The project consists of two internal projects: the infrastructure-as-code CDK project and the service code (AWS Lambda handler code with tests).
These two internal projects are separated to reduce the blast radius in case of failure and improve the testability of each project.
Since I develop serverless services, the example describes a serverless project; however, the structure applies to non-serverless services as well.
The most important project folders are:
cdk folder: contains the CDK application code. A CDK application deploys a stack consisting of one or more CDK constructs. CDK constructs contain AWS resources and their connections/relationships. Read more about the details of the CDK project here and here.
docs folder: service documentation for the GitHub Pages site (optional).
service folder: the AWS Lambda project files: handlers, input/output schemas, and utility folders. Each handler has its own schemas folder. The utilities folder stores code shared across multiple handlers.
tests folder: unit, integration, and E2E tests. Separating the service code from the infrastructure makes it easier to run tests locally in the IDE.
You can find my serverless template project here and the blog post about it here.
Create template projects
To accelerate serverless and CDK adoption, create a template project that bootstraps every new service and team with a working, deployable CDK project: a functional pipeline and AWS Lambda handlers that follow all the best practices.
As an architect in a cloud platform group, we understood the importance of sharing constructs and providing CDK guidelines, including a template project.
This section contains best practices for developing constructs.
Use constructs to model the infrastructure domains of your application.
Strive to keep resources defined directly in the stack, outside of any construct, to a minimum.
If two constructs have the same resource dependency, either:
Put the resource in a new construct and pass it as an input parameter to both, or
Decide which construct "owns" the resource, create it there, and pass it as an input parameter to the other construct.
Business-Domain-Driven Constructs
I don't think there is a right or wrong approach to grouping resources into constructs, as long as it makes sense to you and you can easily find resources and settings.
However, I think it makes the most sense to take a business-domain-based approach when deciding which resources belong together in a construct.
It's easier to find resources and understand their connections just by looking at the service architecture diagram. It's also easier to share design patterns between teams in organizations that may need the same architecture.
Let's say I have a serverless "Reservation" service containing two business domains:
Provide a REST API with CRUD actions for clients of the reservation engine.
Send reservation data to multiple internal services (sharing reservation information asynchronously from service to service).
Let's say I designed the following architecture:
reservation service architecture
API Gateway with CRUD AWS Lambda functions.
AWS Aurora database used by the CRUD functions for read/write queries.
AWS Aurora streams trigger an AWS Lambda function that sends data to external services through an AWS SNS topic.
AWS SNS: a central pub/sub implementation across all services. It allows both the "Reservation" service and external services to exchange messages on the reservation topic.
An AWS SQS queue subscribed to the SNS topic, with an SQS-triggered AWS Lambda function. This Lambda function receives a request event from SNS, queries the Aurora database, and sends a response via SNS.
Note: The AWS Aurora database is accessed by both the CRUD API Lambda functions and the SQS/stream Lambda functions. However, CRUD usage is read/write, while the SQS/stream Lambdas are read-only.
How do we create the constructs?
I would model the service with two business-domain constructs: CRUD and Messaging.
Reservation service constructs
The CRUD construct instantiates the following resources:
AWS Lambda functions and their roles.
Aurora construct: AWS Aurora is relatively complex to define in the CDK (VPC, databases, permissions, etc.) and deserves its own construct, instantiated within the CRUD construct.
The Messaging construct instantiates the following resources:
An AWS SQS queue with a subscription to the AWS SNS topic.
An AWS Lambda function and role triggered by AWS SQS. The function has read-only access to the AWS Aurora database and can send events to AWS SNS.
An AWS Lambda function triggered by Aurora streams that sends reservation data to external services through the AWS SNS topic.
The Messaging construct also receives the Aurora construct as an input parameter to define the permissions and settings its AWS Lambda functions require.
Rule of thumb: any resource shared by all constructs (a dead-letter SQS queue, an error-handling mechanism) should be defined as a separate construct and passed as input to the other constructs.
Create repositories for CDK construct libraries to be used as project dependencies across the enterprise. These constructs are vetted, proven resources or even whole patterns.
You may have a cloud engineering or DevOps team creating such compliant constructs. You can read about cloud platform engineering here. Examples of such constructs include:
WAF rules for AWS API Gateway/CloudFront distributions.
An SNS -> SQS subscription pattern where both SNS and SQS have encryption at rest defined with the required CMKs.
AWS AppConfig dynamic configuration construct.
This section describes guidelines for successful stack deployment, environment modeling, and the developer experience of local development and testing.
“Development teams should be able to use their own accounts to test and provision new resources in those accounts as needed. Individual developers can treat these resources as extensions of their own development workstation” — AWS CDK Best Practices Guide
Several options come to mind; choose the one that makes the most sense for you given your company size, budget, and AWS account administration costs.
Be sure to follow the guidelines and best practices for managing these accounts with AWS Organizations and AWS Control Tower.
Smaller companies can use strategies like an account per developer, where you deploy the ENTIRE product line and its myriad microservices into one account, develop locally, make changes, and push and test immediately against stable versions.
The local development experience is excellent.
Integration testing is easier and can be done locally. However, mimicking real AWS service data can be challenging and reduce the practicality of the tests.
On the downside, observability across many accounts is difficult to maintain.
Managing a large number of accounts is also a burden, considering the small size of the company.
When developers open a PR, the pipeline deploys to a different account. An account per stage is the recommended approach, as it reduces the blast radius of breaches or errors.
However, larger companies may take a different approach.
Sometimes it is impossible to deploy the entire enterprise portfolio into a single account due to the complexity, resource quotas, and knowledge required to deploy all the services.
A viable option for large companies consists of the following:
Service teams get their own separate AWS accounts, which reduces friction, conflicts, and the chances of hitting per-account AWS service quotas.
Each stage of the CI/CD pipeline gets its own AWS account (development, test, staging, production, etc.).
Developers develop together in the development account. They deploy their service's CDK stacks to the same account, but each stack has a different name/prefix to eliminate deployment conflicts.
Developers test with real services implemented in other accounts.
Integration testing across multiple accounts can sometimes require cross-account trust and can be more challenging.
As you can see, each method has its pros and cons; there is no silver bullet.
Model your CI/CD pipeline stages in code
Use local configuration files and environment variables along with code conditions ("if statements") to provide different resource configurations for different accounts and stages.
A development account might use lax policies (for example, a DESTROY removal policy), while a production account requires a robust configuration.
Each stage has different secrets and API keys, reducing the blast radius if an account is breached.
The production account defines a higher AWS Lambda reserved concurrency or enables provisioned concurrency to eliminate cold starts.
Use different billing modes for AWS DynamoDB at different stages: reduce costs in developer accounts and maximize performance in production.
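A minimal sketch of this per-stage configuration (the stage names, `STAGE` environment variable, and concrete values are illustrative assumptions, not from the original post):

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StageConfig:
    """Per-stage settings consumed by the CDK stacks."""

    retain_data: bool                 # maps to RemovalPolicy.RETAIN vs DESTROY
    lambda_reserved_concurrency: int  # higher in production
    dynamodb_on_demand: bool          # pay-per-request in dev, provisioned in prod


_CONFIGS = {
    "dev": StageConfig(retain_data=False, lambda_reserved_concurrency=5, dynamodb_on_demand=True),
    "staging": StageConfig(retain_data=True, lambda_reserved_concurrency=20, dynamodb_on_demand=True),
    "production": StageConfig(retain_data=True, lambda_reserved_concurrency=100, dynamodb_on_demand=False),
}


def get_stage_config(stage: Optional[str] = None) -> StageConfig:
    """Resolve the stage from an environment variable set by the CI/CD pipeline."""
    return _CONFIGS[stage or os.environ.get("STAGE", "dev")]


# Inside a stack, the config drives the "if statements", e.g.:
#   policy = RemovalPolicy.RETAIN if config.retain_data else RemovalPolicy.DESTROY
```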
Resilience and Security Policies
Define removal policies
In a production environment, it is important to set the removal policy of stateful or critical resources (databases) to "RETAIN" so that you do not delete and recreate the data they contain.
In a development environment, it's okay to destroy resources when the stack is deleted or the resource ID changes.
This practice will reduce your costs and wasted resources in the long run.
AWS CDK tests
Use AWS CDK tests to ensure that critical resources are not accidentally deleted and that connections between critical resources remain intact.
These tests run at synth time, so a broken version never gets deployed.
See examples of infrastructure tests in my CDK template here.
The tests verify that the API Gateway resource exists at synth time, before deployment.
View more examples here.
Security defaults are not good enough
Always strive for the least privileged permissions and roles. Each AWS Lambda must have its own role with its minimum subset of permissions.
There are many security best practices for different resources. Sometimes they are reflected in the AWS CDK defaults; for example, S3 blocks public access by default. But sometimes they are not.
It is up to you to always keep security in mind:
Should this resource be in a VPC? Do I need to set IAM authorization for this API gateway route? Maybe make it a private API gateway?
Fortunately, there are tools that can cover your blind spots. You should use tools like cfn-nag and cdk-nag.
Check out the cdk-nag tests I implemented in my CDK template project here.
Secrets in the CDK
Don't store secrets in your code, whether in CDK code or an AWS Lambda handler. Use AWS SSM or AWS Secrets Manager encrypted strings to protect them.
However, how do you pass the secrets to the CDK so that they are deployed to AWS Secrets Manager?
One viable option is to store the secret as a GitHub/Jenkins/CI secret and inject it as a CDK parameter or environment variable during the deployment process.
Always remember resilience
Avoid changing the logical ID of stateful resources.
“Avoid changing the logical ID of stateful resources. Changing the logical ID of the resource will result in the resource being replaced with a new one on the next deployment. The logical ID is derived from the ID you provide when creating an instance of the construct.” — Official best practices for the AWS CDK.
It is always better to be safe than sorry. Errors can happen.
A naive refactoring of a stateful CDK resource (moving constructs, renaming logical IDs, etc.) is all it takes to wipe an entire production database and its customer data.
Create backups for your stateful resources; Use built-in backups, such as DynamoDB's point-in-time restore, or use AWS Backup to create custom backups.
You can use a tool called cdk-notifier to add visibility into infrastructure changes in every PR.
Now you can visually understand changes to your infrastructure before every PR merge. Green entries are resources that will be added, and red ones are resources that will be removed.
Attentive developers can now visually tell when a PR disaster is coming.
Thank you so much, Roy Ben Josef, for this valuable advice!
General Development Tips
If you use built-in CDK constructs and resources, read the documentation. Make sure you understand each parameter and its default value before you start working.
These optional parameters often contain security or resource policies that you must define explicitly.
Write your own IAM policies
A powerful capability of CDK constructs is the large number of built-in IAM-related functions that abstract the actual IAM policies. Example: a DynamoDB table construct can grant read/write permissions to a role ('grant_read_write_data').
While this abstracts away the IAM policies added to the grantee role, it has a side effect:
Developers don't understand IAM policies and what their changes grant to the role. Yes, you can read the documentation for the function, but many developers don't go the extra mile.
Some of these features add permissions that your role doesn't need, violating the security principle of least privilege.
For instance, an AWS AppConfig role that defines only the get-configuration permissions is easy to write and very readable.
Bottom line: define your own AWS IAM policy documents. You can define inline policies or full JSON IAM policies. If possible, limit the Resources section to a specific resource: a specific bucket, a specific AWS DynamoDB table, etc.
While less intuitive, you learn about IAM and build more secure services, and the policies become more precise and less abstract in the long run.
Check out this blog for more examples.
Read more about it here.
When developing CloudFormation resources, first try using the console to understand the relationships between the elements, then create the CFN objects and connect them.
Don't abstract too much
Writing AWS CDK is writing code. As engineers, we thrive on and pride ourselves on clever abstractions and tricks. However, as with anything else in life, don't overdo it, especially with infrastructure code. I would take some minor code duplication, such as an AWS Lambda function definition with a clear, explicit configuration of its environment variables, over a complicated factory method that creates multiple Lambda functions.
Infrastructure code is critical, so it should be as readable and organized as possible. However, that doesn't mean code duplication is always acceptable. Find the right balance between abstraction and readability for you.
Thanks to Alon Sadowskii for helping with this post.
FAQ

How do I organize my CDK files?
CDK applications should be organized into logical units, such as API, database, and monitoring resources, and optionally have a pipeline for automated deployments.

When not to use CDK?
- Where an application is deployed infrequently or is already "in maintenance mode".
- Where the application-development team is separate from the team responsible for production deployment.

Is AWS CDK production-ready?
AWS CDK is a very popular way of defining your deployment with IaC. AWS CDK synthesizes to AWS CloudFormation, which looks back on many years of use in diverse production environments. So yes, AWS CDK is production-ready!

Is AWS CDK better than CloudFormation?
The CDK provides a more structured reuse format than CloudFormation. The three-tiered reuse levels of components, intents, and patterns mean you can build up a library of reusable components and patterns your entire organization can use to build infrastructure and ship applications more quickly.

Is CDK better than Terraform?
As powerful and mature as the CDK may be, it is limited to the AWS cloud. When considering the scope of IaC tools, Terraform is the obvious winner of the two. It makes a lot of sense to have your developers use a single tool for all cloud platforms.