diff --git a/.gitignore b/.gitignore index f60797b..e4744fd 100644 --- a/.gitignore +++ b/.gitignore @@ -2,6 +2,9 @@ !jest.config.js *.d.ts node_modules +*/__pycache__/* +lib/*.json +data/*.json # CDK asset staging directory .cdk.staging diff --git a/README.md b/README.md index 2162c0b..176c47c 100644 --- a/README.md +++ b/README.md @@ -1,193 +1,118 @@ -# TAG Based CloudWatch Dashboard using CDK +# Tag Based CloudWatch Dashboard using CDK [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![AWS Provider](https://img.shields.io/badge/provider-AWS-orange?logo=amazon-aws&color=ff9900)](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) -The project is an example how to use AWS Resource Groups Tagging API to retrieve a specific tag -and then based on found resources pull additional information from respective service APIs to generate -a configuration file (JSON) to build a CloudWatch Dashboard with _reasonable_ metrics and alarms. Optionally customers -can also deploy a central alarm dashboard to monitor alarms across an AWS Organization, AWS Organization OU or across -arbitrary number of AWS accounts. +This project is an example of building CloudWatch Dashboards with CDK. It provides a set of CloudWatch Dashboards with _reasonable_ metrics and alarms for the AWS resources that carry a specific tag, which it discovers through the AWS Resource Groups Tagging API. Optionally, it provides a Central Alarm Dashboard to monitor alarms across an AWS Organization, an AWS Organization OU, or an arbitrary number of AWS accounts. 
-## Features + [![Click to open screenshot](screenshots/EC2-burstable-instance-thumb.png)](#screenshots) [![Click to open screenshot](screenshots/LambdaCompact-thumb.png)](#screenshots) -### Supported services -- Amazon API Gateway v1 (REST) -- Amazon API Gateway v2 (HTTP, WebSockets) -- AWS AppSync -- Amazon Aurora -- Auto Scaling groups -- On-Demand Capacity Reservations -- Amazon CloudFront -- Amazon DynamoDB -- Amazon EBS (as part of EC2) -- Amazon EC2 (support for t\* burstable instances, support for CloudWatch Agent) -- ELB v1 (ELB Classic) -- ELB v2 (ALB, NLB) -- Amazon ECS (EC2 and Fargate) -- Amazon EFS -- AWS Lambda -- AWS Elemental MediaLive -- AWS Elemental MediaPackage -- NAT Gateway -- RDS -- S3 -- SNS -- SQS -- Transit Gateway -- AWS WAFv2 -### Central alarm dashboard features + +## Features +### 1. Metric dashboards +- Discovers AWS resources based on Tags. +- Generates a set of CloudWatch Dashboards for AWS Resources. Dashboards are specifically designed to monitor the most important operational metrics. +- Can build CloudWatch Dashboards for resources in other accounts if CloudWatch cross-account observability is enabled. + +### 2. Alarm dashboard - Event-driven for scalability and speed - Supports arbitrary source accounts within an AWS Organization (different teams can have own dashboards) - Supports automatic source account configuration through stack-sets -- Supports visualization and sorting of alarm priority (CRITICAL, MEDIUM, LOW) through alarm tags in source accounts. -Simply add tag with key `priority` and values critical, medium or low. +- Supports visualization and sorting of alarm priority (CRITICAL, MEDIUM, LOW) through alarm tags in source accounts. Simply add tag with key `priority` and values critical, medium or low. - Supports tag data for EC2 instances in source accounts ## How it works ### Metric dashboards -1. `data/resource_collector.py` is used to call the Resource Groups Tagging API and to generate the configuration file. 
-2. CDK (v2) is used to generate CloudFormation template and deploy it +1. A Python tool, `data/main.py`, retrieves a list of resources using the Resource Groups Tagging API and generates the configuration file. +2. CDK is used to generate a CloudFormation template and deploy it. -The solution will create metrics and alarms following best practices. + ![Architecture ](screenshots/Architecture-Dashboards-SingleAccount.png) -### Central alarm dashboard -When a CloudWatch Alarm changes state (going from OK to ALARM state), an event is emitted to EventBridge in the account. -An event bus rule forwards the event to the central event bus in the monitoring account. This event is then registered -in DynamoDB. CloudWatch custom widgets visualize current alarm state on the dashboard. +For multi-account use cases, see this [diagram](screenshots/Architecture-Dashboards-MultiAccount.png). -## Prerequisites -### To generate the resource configuration: +### Central Alarm Dashboard +When a CloudWatch Alarm changes state (for example, going from OK to ALARM), an event is emitted to EventBridge in the account. An event bus rule forwards the event to the central event bus in the monitoring account. This event is then registered in DynamoDB. CloudWatch custom widgets visualize the current alarm state on the dashboard. -- Python 3 -- Boto 3 (Python module. `python -m pip install boto3`) + ![Architecture ](screenshots/Architecture-Alerts.png) -### To generate the dashboard +## Prerequisites +- Python 3 - NodeJS 16+, recommended 18LTS, (required by CDK v2) - CDK v2 (Installation: `npm -g install aws-cdk@latest`) -## Configuration properties in lib/config.json - -`BaseName` (String:required) - Base-name of your dashboards. This will be the prefix of the dashboard names. - -`ResourceFile` (String:required) - The path for the file where resources are stored. Used by the `resource_collector.py` -when generating resource config and by the CDK when generating the CF template. 
- -`TagKey` (String:required) - Configuration of the tag key that will select resources to be included. - -`TagValues` (Array:required) - List of values of `TagKey` to include. - -`Regions` (Array:required) - List of regions from which resources are displayed. - -`GroupingTagKey` (String:optional) - If set, separate Lambda and EC2 dashboards will be created for every value of that -tag. Every value groups resources by that value. - -`CustomEC2TagKeys` (Array:optional) - If set, the tag info will show in the EC2 header widget in format -Key:Value. Useful to add auxilary information to the header. - -`CustomNamepsaceFile` (String:required) - Detected custom namespaces. Not yet used. - -`Compact` (boolean (true/false):required) - When set to true, multiple Lambda functions will be put in a single widget -set. Useful when there are many Lambda functions. - -`CompactMaxResourcesPerWidget` (Integer:required) - When `Compact` is set to true, determines how many Lambda functions -will be in each widget set. - -`AlarmTopic` (String:optional) - When `AlarmTopic` contains a string with an ARN to a SNS topic, all alarms will be -created with an action to send notification to that SNS topic. - -`AlarmDashboard.enabled` (boolean (true/false):optional) - When set to true deploys the alarm dashboard in the account. - -`AlarmDashboard.organizationId` (String: required when `AlarmDashboard.enabled` is true) - Required in order to set -resource policy on the custom event bus to allow PutEvents from the AWS Organization. +## Installation and Run deployment of CloudWatch Dashboards -`MetricDashboards.enabled` (boolean (true/false):optional) - If not defined or set to true, deploy metric dashboards. -Recommended if only alarm dashboard is being deployed. +1. 
Check out the project, install dependencies and bootstrap CDK + ```bash + git clone git@github.com:aws-samples/tag-based-cloudwatch-dashboard.git + cd tag-based-cloudwatch-dashboard + pip3 install -r requirements.txt + npm ic + cdk bootstrap + ``` +2. Run + ```bash + python3 data/main.py + ``` -## Getting and preparing the code + The tool reads the existing configuration file, then reads resources from your AWS account in the given regions and generates CloudWatch Dashboards for all resources with the given tag. You can fine-tune the dashboards by editing the configuration file and repeating the command above. -1. Check out the project. -2. Change current directory to project directory. (`cd tag-based-cloudwatch-dashboard`) -3. If deploying for the first time, run `cdk bootstrap` to bootstrap the environment -(https://docs.aws.amazon.com/cdk/v2/guide/bootstrapping.html). In case you don't want to bootstrap -read [Deploying without boostraping CDK](BOOTSTRAP.md). -4. Run `npm install` to install dependencies. +3. Explore the dashboards in your [CloudWatch Console](https://console.aws.amazon.com/cloudwatch/home?#dashboards/). -## Configuring the dashboards -1. Open the configuration file `lib/config.json` in your editor of choice. -2. Set TagKey to tag key you want to use and TagValues to an array of values. Dashboard will collect all resources tagged -with that key and the specified values. -3. Set Regions to include the regions that contain resources you want to monitor. -4. **OPTIONAL** if you want to deploy central alarm dashboard set `AlarmDashboard.enabled` to true and provide your AWS -Organizations id in `AlarmDashboard.organizationId`. -5. **OPTIONAL** if you don't want to use metric dashboards you can disable creation of those by setting -`MetricDashboards.enabled` to false. See _Configuration properties in lib/config.json_ above for more information. -6. Save the configuration file. 
+## Enabling Alarms Feature +Enabling the alarms feature requires changes in two accounts: the source account and the destination account. -## Deploying the dashboards +1. In addition to deploying the CloudWatch Dashboards, edit `lib/config.json`: set `AlarmDashboard.enabled` to `true` and provide your AWS Organizations ID in `AlarmDashboard.organizationId` (o-xxxxx, not ou-xxxx). Then update the dashboards using the tool: + ```bash + python3 data/main.py + ``` + Once the CDK deployment is done, find the outputs of the `AlarmDashboardStack` stack and note the values of `CustomEventBusArn` and `CustomDynamoDBFunctionRoleArn`. -1. If the deployment of the metric dashboards have been enabled, run `cd data; python3 resource_collector.py` to create -the resource configuration file (`resources.json` in the `data` directory). -2. **OPTIONAL:** Edit `BaseName`-property in `lib/config.json` to change the name of your dashboard. In case you plan to -deploy multiple sets of dashboards for different applications in the same account, ensure all subsequent deploys have -different `BaseName`. -3. Run `cd ..` to change directory to project root. -4. Run `cdk synth` to generate CF template or use `cdk deploy --all` to deploy directly to your AWS account. -5. In case central alarm dashboard is enabled in the configuration, take note of deployment output, -`*.CustomEventBusArn` and `*.CustomDynamoDBFunctionRoleArn` and copy those ARNs to use in the next stage. +2. Deploy the [`event_forwarder.yaml`](/stack_sets/event_forwarder_template.yaml) template manually through CloudFormation to each source account and each region you wish to enable, or deploy it automatically to an AWS Organization, OU or list of accounts through service-managed StackSets from your management account or StackSet delegate account. Use the values of the `CustomEventBusArn` and `CustomDynamoDBFunctionRoleArn` outputs as input parameters of the Stack/StackSets. 
-## Enabling source accounts to share alarms -_This only applies in case `AlarmDashboard.enabled` is set_ + If you are using StackSets, note that StackSets do not deploy stacks in the management account. If you want alarms from the management account, you additionally need to deploy [`event_forwarder.yaml`](/stack_sets/event_forwarder_template.yaml) in all relevant regions of your management account. -1. Run command `cd stack_sets` to change directory which contains `event_forwarder_template.yaml`. -2. Run command `sh create_stackset.sh ARN_OF_CUSTOM_EVENT_BUS ARN_OF_THE_LAMBDA_FUNCTION_ROLE_ARN`, replace the -placeholder with the ARNs from the previous step. -3. Deploy the generated `event_forwarder.yaml`-template manually to each of the source accounts and each region you wish -to enable through CloudFormation or deploy it automatically to an AWS Organization, OU or list of accounts through -service managed stack-sets from your management account or stack-set delegate account. +## Advanced configuration +You can fine-tune the dashboards by editing the configuration file `lib/config.json`: -## Monitoring alarms in "Management Account" +* `BaseName` (String:required) - Base-name of your dashboards. This will be the prefix of the dashboard names. +* `ResourceFile` (String:required) - The path for the file where resources are stored. Written by `data/main.py` when generating the resource config and read by the CDK when generating the CF template. +* `TagKey` (String:required) - The tag key that selects the resources to be included. +* `TagValues` (Array:required) - List of values of `TagKey` to include. +* `Regions` (Array:required) - List of regions from which resources are displayed. +* `GroupingTagKey` (String:optional) - If set, separate Lambda and EC2 dashboards will be created for every value of that tag. Every value groups resources by that value. 
+* `CustomEC2TagKeys` (Array:optional) - If set, the tag info will show in the EC2 header widget in format Key:Value. Useful to add auxiliary information to the header. +* `CustomNamepsaceFile` (String:required) - Detected custom namespaces. Not yet used. +* `Compact` (boolean (true/false):required) - When set to true, multiple Lambda functions will be put in a single widget set. Useful when there are many Lambda functions. +* `CompactMaxResourcesPerWidget` (Integer:required) - When `Compact` is set to true, determines how many Lambda functions will be in each widget set. +* `AlarmTopic` (String:optional) - When `AlarmTopic` contains the ARN of an SNS topic, all alarms will be created with an action to send a notification to that SNS topic. +* `AlarmDashboard.enabled` (boolean (true/false):optional) - When set to true, deploys the alarm dashboard in the account. +* `AlarmDashboard.organizationId` (String: required when `AlarmDashboard.enabled` is true) - Required in order to set the resource policy on the custom event bus to allow PutEvents from the AWS Organization. +* `MetricDashboards.enabled` (boolean (true/false):optional) - If not defined or set to true, deploys the metric dashboards. Set to false if only the alarm dashboard is being deployed. -In case you have alarms in the AWS Organizations management account but are deploying the Alarm Dashboard in another -account, you will need to manually deploy `event_forwarder.yaml` in the management account in all regions that you want -to receive alarms from. This is because of that even if the `event_forwarder.yaml` is deployed as a managed stack set it -won't get deployed in the management account. - -## Tips - -Try setting up a CodeCommit repository where you store your code. Set up a CI/CD pipeline to automatically redeploy your dashboard. -This way, if you want to change/add/remove any metrics for any of the services you change the code, commit it, and it will be automatically deployed. 
- -Try creating an EventBridge rule that will listen to specific tag change and trigger the CodeBuild project to redeploy the dashboard. -This way, if you have an autoscaling group or just tag additional resources the dashboard will deploy automatically. In case you do so, monitor your builds -to avoid rare situations where a lot of tag changes could cause excessive amounts of concurrent or queued builds (for example event bridge rule misconfiguration or -variable loads that causes ASG to scale up and down quickly). This can be done by specifying tag value in the Event Bridge rule or instead of triggering the build -directly from Event Bridge sending it to a Lambda function for more flexible decision-making on whether to trigger a build or not. ## Screenshots -> Click on the thumbnails to see the full res screenshot - > Note that all blue labels in the headers (text widgets) are links that will take you to the respective resource in the console for quick access. ### Lambda in "compact" mode - Number of Lambda functions per widget is controlled through `CompactMaxResourcesPerWidget` parameter in `lib/config.json` - [![Click to open screenshot](screenshots/LambdaCompact-thumb.png)](screenshots/LambdaCompact.png) + ![screenshots/LambdaCompact](screenshots/LambdaCompact.png) ### EC2 Instance - Individual EBS volumes are presented with additional volume information (type and IOPS) - PIO volumes are presented with additional metrics - [![Click to open screenshot](screenshots/EC2-instance-thumb.png)](screenshots/EC2-instance.png) + ![screenshots/EC2-instance](screenshots/EC2-instance.png) ### Burstable EC2 Instance with CloudWatch agent configured @@ -195,22 +120,58 @@ directly from Event Bridge sending it to a Lambda function for more flexible dec - Additional metrics to keep track of CPU-credits usage are shown - If CloudWatch agent is configured then the widgets are shown automatically - [![Click to open 
screenshot](screenshots/EC2-burstable-instance-thumb.png)](screenshots/EC2-burstable-instance.png) + ![screenshots/EC2-burstable](screenshots/EC2-burstable-instance.png) ### Network dashboard - TGW view - Metrics are shown on TGW and on attachment level - Type of attachment is shown - [![Click to open screenshot](screenshots/Network-TGW-thumb.png)](screenshots/Network-TGW.png) + ![screenshots/Network-TGW](screenshots/Network-TGW.png) ### ECS with EC2 service - Cluster level and service level metrics are shown separately - If service is EC2-type then high level metrics for EC2 instances are shown - [![Click to open screenshot](screenshots/ECS-EC2-service-thumb.png)](screenshots/ECS-EC2-service.png) + ![screenshots/ECS-EC2-service](screenshots/ECS-EC2-service.png) + -### Developing +## Supported services +- Amazon API Gateway v1 (REST) +- Amazon API Gateway v2 (HTTP, WebSockets) +- AWS AppSync +- Amazon Aurora +- Auto Scaling groups +- On-Demand Capacity Reservations +- Amazon CloudFront +- Amazon DynamoDB +- Amazon EBS (as part of EC2) +- Amazon EC2 (support for t\* burstable instances, support for CloudWatch Agent) +- ELB v1 (ELB Classic) +- ELB v2 (ALB, NLB) +- Amazon ECS (EC2 and Fargate) +- Amazon EFS +- AWS Lambda +- AWS Elemental MediaLive +- AWS Elemental MediaPackage +- NAT Gateway +- RDS +- S3 +- SNS +- SQS +- Transit Gateway +- AWS WAFv2 + +## Tips +You can set up a CodeCommit repository where you store your code. Set up a CI/CD pipeline to automatically redeploy your dashboard. +This way, if you want to change/add/remove any metrics for any of the services you change the code, commit it, and it will be automatically deployed. + +You can also create an EventBridge rule that listens for specific tag changes and triggers the CodeBuild project to redeploy the dashboard. +This way, if you have an autoscaling group or just tag additional resources the dashboard will deploy automatically. 
In case you do so, monitor your builds +to avoid rare situations where a lot of tag changes could cause excessive amounts of concurrent or queued builds (for example event bridge rule misconfiguration or +variable loads that causes ASG to scale up and down quickly). This can be done by specifying tag value in the Event Bridge rule or instead of triggering the build +directly from Event Bridge sending it to a Lambda function for more flexible decision-making on whether to trigger a build or not. +## Developing [Developing](DEVELOPING.md) diff --git a/data/main.py b/data/main.py new file mode 100644 index 0000000..14ec512 --- /dev/null +++ b/data/main.py @@ -0,0 +1,183 @@ +import os +import json +import time +import datetime + +import click +import boto3 +from tqdm import tqdm +from InquirerPy import inquirer +from InquirerPy.base.control import Choice + +from resource_collector import get_config, get_resources, router + +class App(): + def __init__(self, profile=None): + self.session = boto3.session.Session(profile_name=profile) + + def get_active_regions(self, days=30, threshold=1): + """ Retrieve from Cost Explorer the list of regions where spend is over threshold + """ + client = self.session.client('ce') + end = datetime.datetime.utcnow().date() + start = end - datetime.timedelta(days=days) + response = client.get_cost_and_usage( + TimePeriod={ + 'Start': start.strftime('%Y-%m-%d'), + 'End': end.strftime('%Y-%m-%d') + }, + Granularity='DAILY', + Metrics=['UnblendedCost'], + GroupBy=[ + { + 'Type': 'DIMENSION', + 'Key': 'SERVICE' + }, + { + 'Type': 'DIMENSION', + 'Key': 'REGION' + } + ] + ) + region_spend = {} + for result in response['ResultsByTime']: + for group in result['Groups']: + if len(group['Keys']) > 1: + region = group['Keys'][1] + service = group['Keys'][0] + amount = group['Metrics']['UnblendedCost']['Amount'] + if region == 'global': + continue + if region not in region_spend: + region_spend[region] = 0 + region_spend[region] += float(amount) + 
return [region for region, amount in region_spend.items() if amount > threshold] + + def get_regions(self, default=None): + ec2_client = self.session.client('ec2') + all_regions = [region['RegionName'] for region in ec2_client.describe_regions()['Regions']] + if not default: + try: + default = self.get_active_regions() + ['us-east-1'] + except Exception: # fall back if Cost Explorer data is unavailable + default = ['us-east-1'] + return inquirer.checkbox( + message="Select regions:", + choices=sorted([Choice(value=name, enabled=name in default) for name in all_regions], key=lambda x: str(int(x.enabled)) + x.value, reverse=True), + cycle=False, + ).execute() + + def get_tag_key(self, default=None): + resource_tagging_api = self.session.client('resourcegroupstaggingapi') + tag_keys = list(resource_tagging_api.get_paginator('get_tag_keys').paginate().search('TagKeys')) + return inquirer.fuzzy( + message="Select Tag Key:", + choices=[Choice(value=name) for name in tag_keys], + default=default, + ).execute() + + def get_tag_values(self, key, default=None): + resource_tagging_api = self.session.client('resourcegroupstaggingapi') + tag_values = list(resource_tagging_api.get_paginator('get_tag_values').paginate(Key=key).search('TagValues')) + default = default or [] + return inquirer.checkbox( + message=f"Select Tag {key} Value :", + choices=[Choice(value=name, enabled=name in default) for name in tag_values], + cycle=False, + ).execute() + + def account_id(self): + return self.session.client('sts').get_caller_identity()['Account'] + + +@click.command() +@click.option('--regions', default=None, help='Comma Separated list of regions') +@click.option('--tag', default=None, help='a Tag name') +@click.option('--values', default=None, help='Comma Separated list of values') +@click.option('--config-file', default="lib/config.json", help='Json config file', type=click.Path()) +@click.option('--output-file', default="../data/resources.json", help='output file', type=click.Path()) +@click.option('--custom-namespaces-file', 
default="./custom_namespaces.json", help='custom_namespaces file', type=click.Path()) +@click.option('--base-name', default=None, help='Base Name') +@click.option('--grouping-tag-key', default=None, help='GroupingTagKey') +@click.option('--profile', default=None, help='Profile') +def main(base_name, regions, tag, values, config_file, output_file, custom_namespaces_file, grouping_tag_key, profile): + """ Main """ + app = App(profile) + if not os.path.exists(config_file): + print('Reading from default config') + main_config = json.load(open("lib/config-example.json")) + else: + print(f'Reading from {config_file}') + main_config = json.load(open(config_file)) + print(main_config) + # Read from command line and parameters + base_name = base_name or main_config.get('BaseName') + grouping_tag_key = grouping_tag_key or main_config.get('GroupingTagKey') + regions = regions or main_config.get('Regions') + tag = tag or main_config.get('TagKey') + values = values or main_config.get('TagValues') + output_file = output_file or main_config.get('ResourceFile') + + # Confirm from user + base_name = inquirer.text('Enter BaseName', default=base_name or 'Application').execute() + regions = app.get_regions(default=regions) + tag = app.get_tag_key(default=tag) + values = app.get_tag_values(tag, default=values or []) + + need_scan = True + decorated_resources = [] + print(output_file) + if os.path.exists(output_file): + choice = inquirer.select( + f'Resources file was updated {time.ctime(os.path.getmtime(output_file))}', + choices=["Amend/update", "Override", "Skip scan and use previous results"], + default="Amend/update", + ).execute() + if choice == "Amend/update": + with open(output_file) as _file : + decorated_resources = json.load(_file) + account_id = boto3.client('sts').get_caller_identity()['Account'] + # clean from current account resources + account_id = app.account_id() + decorated_resources = [resource for resource in decorated_resources if account_id not in 
resource.get('ResourceARN', '')] + elif choice == "Override": + need_scan = True + else: + need_scan = False + + if need_scan: + if 'us-east-1' not in regions: + regions.append('us-east-1') + print('Added us-east-1 region for global services') + + for region in tqdm(regions, desc='Regions', leave=False): + config = get_config(region) + resources = get_resources(tag, values, app.session, config) + for resource in tqdm(resources, desc='Resources', leave=False): + decorated_resources.append(router(resource, app.session, config)) + + with open(output_file[1:], "w") as _file: + json.dump(decorated_resources, _file, indent=4, default=str) + print(f'output: {output_file}') + + config_file = config_file or "lib/config.json" + with open(config_file, "w") as _file: + main_config["BaseName"] = base_name + main_config["Regions"] = regions + main_config["TagKey"] = tag + main_config["TagValues"] = values + main_config["ResourceFile"] = output_file + json.dump(main_config, _file, indent=4, default=str) + print(f'config: {config_file}') + + if not os.path.exists('node_modules') and inquirer.confirm(f'Looks like node dependencies are not installed. 
Run `npm ic` ?', default=True).execute(): + os.system('npm ic') + + if inquirer.confirm(f'Run `cdk synth` ?', default=True).execute(): + os.system('cdk synth') + + if inquirer.confirm(f'Run `cdk deploy` ?', default=True).execute(): + os.system('cdk deploy') + +if __name__ == '__main__': + main() diff --git a/data/resource_collector.py b/data/resource_collector.py index f9976d3..38e6589 100644 --- a/data/resource_collector.py +++ b/data/resource_collector.py @@ -1,193 +1,113 @@ -import boto3 import json -import math + +import boto3 from botocore.config import Config +from tqdm import tqdm -singletons = [] +def chunks(lst, n): + """Yield successive n-sized chunks from lst.""" + for i in range(0, len(lst), n): + yield lst[i:i + n] -def get_resources(tag_name, tag_values, config): +def log(*args, **kwargs): + tqdm.write(*args, **kwargs) + +def get_resources(tag_name, tag_values, session, config): """Get resources from resource groups and tagging API. Assembles resources in a list containing only ARN and tags """ - resourcetaggingapi = boto3.client('resourcegroupstaggingapi', config=config) - resources = [] - - tags = len(tag_values) - if tags > 5: - tags_processed = 0 - while tags_processed < tags: - incremental_tag_values = tag_values[tags_processed:tags_processed+5] - resources = get_resources_from_api(resourcetaggingapi, resources, tag_name, incremental_tag_values) - tags_processed += 5 - else: - resources = get_resources_from_api(resourcetaggingapi, resources, tag_name, tag_values) - resources.extend(autoscaling_retriever(tag_name, tag_values, config)) - return resources - - -def get_resources_from_api(resourcetaggingapi, resources, tag_name, tag_values): - response = resourcetaggingapi.get_resources( - TagFilters=[ - { - 'Key': tag_name, - 'Values': tag_values - }, - ], - ResourcesPerPage=40 + return ( + get_resources_from_api(tag_name, tag_values, session, config) + + autoscaling_retriever(tag_name, tag_values, session, config) ) - 
resources.extend(response['ResourceTagMappingList']) - while response['PaginationToken'] != '': - print('Got the pagination token') - response = resourcetaggingapi.get_resources( - PaginationToken=response['PaginationToken'], - TagFilters=[ - { - 'Key': tag_name, - 'Values': tag_values - }, - ], - ResourcesPerPage=40 - ) - resources.extend(response['ResourceTagMappingList']) - - return resources - -def autoscaling_retriever(tag_name, tag_values, config): +def get_resources_from_api(tag_name, tag_values, session, config): + """Get resources from resource groups tagging api + """ + resourcetaggingapi = session.client('resourcegroupstaggingapi', config=config) resources = [] - tags = len(tag_values) - if tags > 5: - tags_processed = 0 - while tags_processed < tags: - incremental_tag_values = tag_values[tags_processed:tags_processed+5] - resources.extend(get_asgs_from_api(tag_name, incremental_tag_values, config)) - tags_processed += 5 - else: - resources.extend(get_asgs_from_api(tag_name, tag_values, config)) - + for chunk_of_tag_values in chunks(tag_values, 5): # api supports only 5 values + resources += list( + resourcetaggingapi.get_paginator('get_resources').paginate( + TagFilters=[{'Key': tag_name, 'Values': chunk_of_tag_values}] + ).search('ResourceTagMappingList') + ) return resources -def get_asgs_from_api(tag_name, tag_values, config): +def autoscaling_retriever(tag_name, tag_values, session, config): """Autoscaling is not supported by resource groups and tagging api - This is - :return: """ - asg = boto3.client('autoscaling', config=config) + asg = session.client('autoscaling', config=config) resources = [] - response = asg.describe_auto_scaling_groups( - Filters=[ - { - 'Name': 'tag:'+tag_name, - 'Values': tag_values - } - ], - MaxRecords=10 - ) - resources.extend(response['AutoScalingGroups']) - try: - while response['NextToken']: - response = asg.describe_auto_scaling_groups( - NextToken=response['NextToken'], - Filters=[ - { - 'Name': 'tag:'+tag_name, - 
'Values': tag_values - } - ], - MaxRecords=10 - ) - resources.extend(response['AutoScalingGroups']) - except: - print(f'Done fetching autoscaling groups') - + for chunk_of_tag_values in chunks(tag_values, 5): # api supports only 5 values + resources += list( + asg.get_paginator('describe_auto_scaling_groups').paginate( + Filters=[{'Name': 'tag:' + tag_name, 'Values': chunk_of_tag_values}] + ).search('AutoScalingGroups') + ) for resource in resources: resource['ResourceARN'] = resource['AutoScalingGroupARN'] - return resources -def cw_custom_namespace_retriever(config): - """Retrieving all custom namespaces - """ - cw = boto3.client('cloudwatch', config=config) - resources = [] - response = cw.list_metrics() - for record in response['Metrics']: - if not record['Namespace'].startswith('AWS/') and not record['Namespace'].startswith('CWAgent') and record['Namespace'] not in resources: - resources.append(record['Namespace']) - print(resources) - try: - while response['NextToken']: - response = cw.list_metrics( - NextToken = response['NextToken'] - ) - for record in response['Metrics']: - if not record['Namespace'].startswith('AWS/') and not record['Namespace'].startswith('CWAgent') and record['Namespace'] not in resources: - resources.append(record['Namespace']) - print(resources) - except: - print(f'Done fetching cloudwatch namespaces') - return resources - - - - -def router(resource, config): +def router(resource, session, config): arn = resource['ResourceARN'] if ':apigateway:' in arn and '/restapis/' in arn and 'stages' not in arn: - resource = apigw1_decorator(resource, config) + resource = apigw1_decorator(resource, session, config) elif ':apigateway:' in arn and '/apis/' in arn and 'stages' not in arn: - resource = apigw2_decorator(resource, config) + resource = apigw2_decorator(resource, session, config) elif ':appsync:' in arn: - resource = appsync_decorator(resource, config) + resource = appsync_decorator(resource, session, config) elif ':rds:' in arn and 
':cluster:' in arn: - resource = aurora_decorator(resource, config) + resource = aurora_decorator(resource, session, config) elif ':autoscaling:' in arn and ':autoScalingGroup:' in arn: - resource = autoscaling_decorator(resource, config) + resource = autoscaling_decorator(resource, session, config) elif ':capacity-reservation/' in arn: - resource = odcr_decorator(resource, config) + resource = odcr_decorator(resource, session, config) elif ':dynamodb:' in arn and ':table/' in arn: - resource = dynamodb_decorator(resource, config) + resource = dynamodb_decorator(resource, session, config) elif ':ec2:' in arn and ':instance/' in arn: - resource = ec2_decorator(resource, config) + tmpresource = ec2_decorator(resource, session, config) + if len(tmpresource) > 0: + resource = tmpresource elif 'lambda' in arn and 'function' in arn: - resource = lambda_decorator(resource, config) + resource = lambda_decorator(resource, session, config) elif 'elasticloadbalancing' in arn and '/net/' not in arn and '/app/' not in arn and ':targetgroup/' not in arn: - resource = elb1_decorator(resource, config) + resource = elb1_decorator(resource, session, config) elif 'elasticloadbalancing' in arn and ( '/net/' in arn or '/app/' in arn ) and ':targetgroup/' not in arn: - resource = elb2_decorator(resource, config) + resource = elb2_decorator(resource, session, config) elif ':ecs:' in arn and ':cluster/' in arn: - resource = ecs_decorator(resource, config) + resource = ecs_decorator(resource, session, config) elif ':natgateway/' in arn and ':ec2:' in arn: - resource = natgw_decorator(resource, config) + resource = natgw_decorator(resource, session, config) elif ':transit-gateway/' in arn and ':ec2:' in arn: - resource = tgw_decorator(resource, config) + resource = tgw_decorator(resource, session, config) elif ':sqs:' in arn: - resource = sqs_decorator(resource, config) + resource = sqs_decorator(resource, session, config) elif 'arn:aws:s3:' in arn: - resource = s3_decorator(resource, 
config) + resource = s3_decorator(resource, session, config) elif ':sns:' in arn: - resource = sns_decorator(resource, config) + resource = sns_decorator(resource, session, config) elif ':cloudfront:' in arn and ':distribution/' in arn: - resource = cloudfront_decorator(resource, config) + resource = cloudfront_decorator(resource, session, config) elif ':elasticache:' in arn: - resource = elasticache_decorator(resource, config) + resource = elasticache_decorator(resource, session, config) elif ':mediapackage:' in arn and ':channels/' in arn: - resource = mediapackage_decorator(resource, config) + resource = mediapackage_decorator(resource, session, config) elif ':medialive:' in arn and ':channel:' in arn: - resource = medialive_decorator(resource, config) + resource = medialive_decorator(resource, session, config) elif ':elasticfilesystem:' in arn: - resource = efs_decorator(resource, config) + resource = efs_decorator(resource, session, config) elif 'arn:aws:elasticbeanstalk:' in arn: - resource = beanstalk_decorator(resource,config) + resource = beanstalk_decorator(resource,session, config) return resource -def apigw1_decorator(resource, config): - print(f'This resource is API Gateway 1 {resource["ResourceARN"]}') +def apigw1_decorator(resource, session, config): + log(f'This resource is API Gateway 1 {resource["ResourceARN"]}') apiid = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] - apigw = boto3.client('apigateway', config=config) + apigw = session.client('apigateway', config=config) response = apigw.get_rest_api( restApiId=apiid ) @@ -200,10 +120,10 @@ def apigw1_decorator(resource, config): resource['stages'] = response2['item'] return resource -def apigw2_decorator(resource, config): - print(f'This resource is API Gateway 2 {resource["ResourceARN"]}') +def apigw2_decorator(resource, session, config): + log(f'This resource is API Gateway 2 {resource["ResourceARN"]}') apiid = 
resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/')) - 1] - apigw = boto3.client('apigatewayv2', config=config) + apigw = session.client('apigatewayv2', config=config) response = apigw.get_api( ApiId=apiid ) @@ -215,10 +135,10 @@ def apigw2_decorator(resource, config): return resource -def appsync_decorator(resource, config): - print(f'This resource is AppSync {resource["ResourceARN"]}') +def appsync_decorator(resource, session, config): + log(f'This resource is AppSync {resource["ResourceARN"]}') apiid = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/')) - 1] - appsync = boto3.client('appsync', config=config) + appsync = session.client('appsync', config=config) response = appsync.get_graphql_api( apiId=apiid ) @@ -231,10 +151,10 @@ def appsync_decorator(resource, config): return resource -def aurora_decorator(resource, config): - print(f'This resource is Aurora {resource["ResourceARN"]}') +def aurora_decorator(resource, session, config): + log(f'This resource is Aurora {resource["ResourceARN"]}') clusterid = resource['ResourceARN'].split(':')[len(resource['ResourceARN'].split(':')) - 1] - rds = boto3.client('rds', config=config) + rds = session.client('rds', config=config) try: response = rds.describe_db_clusters( DBClusterIdentifier=clusterid @@ -252,21 +172,21 @@ def aurora_decorator(resource, config): resource['Iops'] = response['DBClusters'][0]['Iops'] resource['PerformanceInsightsEnabled'] = response['DBClusters'][0]['PerformanceInsightsEnabled'] except: - print('Just aurora-resource') + log('Just aurora-resource') return resource -def autoscaling_decorator(resource, config): - print(f'This resource is Autoscaling Group {resource["ResourceARN"]}') +def autoscaling_decorator(resource, session, config): + log(f'This resource is Autoscaling Group {resource["ResourceARN"]}') return resource -def beanstalk_decorator(resource, config): +def beanstalk_decorator(resource, session, config): return resource -def 
cloudfront_decorator(resource, config): - print(f'This resource is CloudFront distribution') - client = boto3.client('cloudfront', config=config) +def cloudfront_decorator(resource, session, config): + log(f'This resource is CloudFront distribution') + client = session.client('cloudfront', config=config) response = client.get_distribution( Id = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] ) @@ -277,10 +197,11 @@ def cloudfront_decorator(resource, config): resource['Origins'] = response['Distribution']['DistributionConfig']['Origins'] return resource -def mediapackage_decorator(resource, config): - print(f'this resource is Mediapackage channel') + +def mediapackage_decorator(resource, session, config): + log(f'this resource is Mediapackage channel') arn = resource['ResourceARN'] - client = boto3.client('mediapackage', config=config) + client = session.client('mediapackage', config=config) response = client.list_channels( MaxResults=40, @@ -296,11 +217,12 @@ def mediapackage_decorator(resource, config): resource ['IngestEndpoint'] = response2['HlsIngest']['IngestEndpoints'] resource['OriginEndpoint'] = origin_endpoint['OriginEndpoints'] return resource - -def medialive_decorator(resource, config): - print(f'this resource is Medialive channel') + + +def medialive_decorator(resource, session, config): + log(f'this resource is Medialive channel') arn = resource['ResourceARN'] - client = boto3.client('medialive', config=config) + client = session.client('medialive', config=config) response = client.list_channels( MaxResults=40, ) @@ -313,16 +235,17 @@ def medialive_decorator(resource, config): ) resource['Pipeline'] = response2['PipelineDetails'] return resource - -def odcr_decorator(resource, config): - print(f'This resource is ODCR {resource["ResourceARN"]}') + + +def odcr_decorator(resource, session, config): + log(f'This resource is ODCR {resource["ResourceARN"]}') return resource -def dynamodb_decorator(resource, config): - 
print(f'This resource is DynamoDB {resource["ResourceARN"]}') +def dynamodb_decorator(resource, session, config): + log(f'This resource is DynamoDB {resource["ResourceARN"]}') tablename = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] - ddb = boto3.client('dynamodb', config=config) + ddb = session.client('dynamodb', config=config) response = ddb.describe_table( TableName=tablename ) @@ -339,10 +262,10 @@ def dynamodb_decorator(resource, config): resource['rcu'] = rcu return resource -def efs_decorator(resource, config): - print(f'This resource is EFS {resource["ResourceARN"]}') +def efs_decorator(resource, session, config): + log(f'This resource is EFS {resource["ResourceARN"]}') fsId = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] - efs = boto3.client('efs', config=config) + efs = session.client('efs', config=config) response = efs.describe_file_systems( FileSystemId=fsId ) @@ -351,31 +274,33 @@ def efs_decorator(resource, config): return resource -def ec2_decorator(resource, config): - print(f'This resource is EC2 {resource["ResourceARN"]}') +def ec2_decorator(resource, session, config): + log(f'This resource is EC2 {resource["ResourceARN"]}') instanceid = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] - ec2 = boto3.client('ec2', config=config) + ec2 = session.client('ec2', config=config) - volumes = [] - - response = ec2.describe_volumes( - Filters=[ - { - 'Name': 'attachment.instance-id', - 'Values': [ - instanceid, + response = ec2.describe_instances( + Filters=[ + { + 'Name': 'instance-id', + 'Values': [ + instanceid + ] + } ] - }, - ], - MaxResults=100 - ) + ) + reservations = response.get('Reservations', []) + if len(reservations) > 0: + instances = reservations[0].get('Instances', []) + instance = instances[0] - for record in response['Volumes']: - volumes.append(record) + resource['Instance'] = instance + + if 'State' in resource['Instance'] and 'Name' in 
resource['Instance']['State'] and resource['Instance']['State']['Name'] != 'terminated': + print('This instance is not terminated') + volumes = [] - try: - while response['NextToken']: response = ec2.describe_volumes( Filters=[ { @@ -385,58 +310,74 @@ def ec2_decorator(resource, config): ] }, ], - MaxResults=100, - NextToken=response['NextToken'] + MaxResults=100 ) + for record in response['Volumes']: volumes.append(record) - except: - print(f'Done fetching volumes') - - resource['Volumes'] = volumes + try: + while response['NextToken']: + response = ec2.describe_volumes( + Filters=[ + { + 'Name': 'attachment.instance-id', + 'Values': [ + instanceid, + ] + }, + ], + MaxResults=100, + NextToken=response['NextToken'] + ) + for record in response['Volumes']: + volumes.append(record) + + except: + log(f'Done fetching volumes') + + resource['Volumes'] = volumes + + + instanceType = resource['Instance']['InstanceType'] + + if 't2' in instanceType or 't3' in instanceType or 't4' in instanceType: + response = ec2.describe_instance_credit_specifications( + InstanceIds=[instanceid] + ) + resource['CPUCreditSpecs'] = response['InstanceCreditSpecifications'][0] + + cw = session.client('cloudwatch', config=config) + results = cw.get_paginator('list_metrics') + for response in results.paginate( + Namespace='CWAgent', + Dimensions=[ + {'Name': 'InstanceId', 'Value': instanceid} + ], ): + if len(response['Metrics']) > 0: + log(f'Instance {instanceid} has CWAgent') + resource['CWAgent'] = 'True' + resource['CWAgentMetrics'] = response['Metrics'] + else: + log(f'Instance {instanceid} does not have CWAgent') + resource['CWAgent'] = 'False' + + return resource + else: + print('This instance is terminated, ignoring') + return [] + else: + print('Resource is not found') + return [] - response = ec2.describe_instances( - Filters=[ - { - 'Name': 'instance-id', - 'Values': [ - instanceid - ] - } - ] - ) - resource['Instance'] = response['Reservations'][0]['Instances'][0] - instanceType = 
resource['Instance']['InstanceType'] - if 't2' in instanceType or 't3' in instanceType or 't4' in instanceType: - response = ec2.describe_instance_credit_specifications( - InstanceIds=[instanceid] - ) - resource['CPUCreditSpecs'] = response['InstanceCreditSpecifications'][0] - - cw = boto3.client('cloudwatch', config=config) - results = cw.get_paginator('list_metrics') - for response in results.paginate( - MetricName='mem_used_percent', - Namespace='CWAgent', - Dimensions=[ - {'Name': 'InstanceId', 'Value': instanceid} - ], ): - if len(response['Metrics']) > 0: - print(f'Instance {instanceid} has CWAgent') - resource['CWAgent'] = 'True' - else: - print(f'Instance {instanceid} does not have CWAgent') - resource['CWAgent'] = 'False' - return resource -def elasticache_decorator(resource, config): - print(f'This resource is Elasticache {resource["ResourceARN"]}') +def elasticache_decorator(resource, session, config): + log(f'This resource is Elasticache {resource["ResourceARN"]}') if ':cluster:' in resource['ResourceARN']: clusterid = resource['ResourceARN'].split(':')[len(resource['ResourceARN'].split(':'))-1] - client = boto3.client('elasticache', config=config) + client = session.client('elasticache', config=config) response = client.describe_cache_clusters( CacheClusterId=clusterid ) @@ -454,10 +395,10 @@ def elasticache_decorator(resource, config): return resource -def lambda_decorator(resource, config): - print(f'This resource is Lambda {resource["ResourceARN"]}') +def lambda_decorator(resource, session, config): + log(f'This resource is Lambda {resource["ResourceARN"]}') functionname = resource['ResourceARN'].split(':')[len(resource['ResourceARN'].split(':')) - 1] - lambdaclient = boto3.client('lambda', config=config) + lambdaclient = session.client('lambda', config=config) response = lambdaclient.get_function( FunctionName=functionname ) @@ -465,10 +406,10 @@ def lambda_decorator(resource, config): return resource -def elb1_decorator(resource, config): - 
print(f'This resource is ELBv1 {resource["ResourceARN"]}') +def elb1_decorator(resource, session, config): + log(f'This resource is ELBv1 {resource["ResourceARN"]}') elbname = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] - elb = boto3.client('elb', config=config) + elb = session.client('elb', config=config) response = elb.describe_load_balancers( LoadBalancerNames=[ elbname @@ -478,9 +419,9 @@ def elb1_decorator(resource, config): return resource -def elb2_decorator(resource, config): - print(f'This resource is ELBv2 {resource["ResourceARN"]}') - elb = boto3.client('elbv2', config=config) +def elb2_decorator(resource, session, config): + log(f'This resource is ELBv2 {resource["ResourceARN"]}') + elb = session.client('elbv2', config=config) response = elb.describe_load_balancers( LoadBalancerArns=[ resource['ResourceARN'] @@ -494,9 +435,9 @@ def elb2_decorator(resource, config): return resource -def ecs_decorator(resource, config): - print(f'This resource is ECS {resource["ResourceARN"]}') - ecs = boto3.client('ecs', config=config) +def ecs_decorator(resource, session, config): + log(f'This resource is ECS {resource["ResourceARN"]}') + ecs = session.client('ecs', config=config) response = ecs.describe_clusters( clusters=[ resource['ResourceARN'] @@ -522,7 +463,7 @@ def ecs_decorator(resource, config): for lb in service['loadBalancers']: target_groups.append(lb['targetGroupArn']) - elb = boto3.client('elbv2', config=config) + elb = session.client('elbv2', config=config) for target_group in target_groups: response = elb.describe_target_health( TargetGroupArn=target_group @@ -538,20 +479,20 @@ def ecs_decorator(resource, config): return resource -def natgw_decorator(resource, config): - print(f'This resource is NAT-gw {resource["ResourceARN"]}') +def natgw_decorator(resource, session, config): + log(f'This resource is NAT-gw {resource["ResourceARN"]}') return resource -def rds_decorator(resource, config): - print(f'This resource is RDS 
{resource["ResourceARN"]}') +def rds_decorator(resource, session, config): + log(f'This resource is RDS {resource["ResourceARN"]}') return resource -def s3_decorator(resource, config): +def s3_decorator(resource, session, config): bucket_name = resource['ResourceARN'].split(':')[len(resource['ResourceARN'].split(':'))-1] resource['BucketName'] = bucket_name - print(f'This resource {bucket_name} is S3 bucket') - s3client = boto3.client('s3', config=config) + log(f'This resource {bucket_name} is S3 bucket') + s3client = session.client('s3', config=config) try: encryption_request = s3client.get_bucket_encryption( Bucket=bucket_name @@ -578,10 +519,10 @@ def s3_decorator(resource, config): return resource -def sqs_decorator(resource, config): - print(f'This resource is SQS {resource["ResourceARN"]}') +def sqs_decorator(resource, session, config): + log(f'This resource is SQS {resource["ResourceARN"]}') queueName = resource['ResourceARN'].split(':')[len(resource['ResourceARN'].split(':'))-1] - sqs = boto3.client('sqs', config=config) + sqs = session.client('sqs', config=config) response = sqs.get_queue_url( QueueName=queueName ) @@ -592,9 +533,9 @@ def sqs_decorator(resource, config): resource['Attributes'] = response['Attributes'] return resource -def sns_decorator(resource, config): - print(f'This resource is SNS {resource["ResourceARN"]}') -# sns = boto3.client('sns', config=config) +def sns_decorator(resource, session, config): + log(f'This resource is SNS {resource["ResourceARN"]}') +# sns = session.client('sns', config=config) # response = sns.get_topic_attributes( # TopicArn=resource['ResourceARN'] # ) @@ -604,10 +545,10 @@ def sns_decorator(resource, config): return resource -def tgw_decorator(resource, config): - print(f'This resource is TGW {resource["ResourceARN"]}') +def tgw_decorator(resource, session, config): + log(f'This resource is TGW {resource["ResourceARN"]}') tgwid = resource['ResourceARN'].split('/')[len(resource['ResourceARN'].split('/'))-1] - tgw 
= boto3.client('ec2', config=config) + tgw = session.client('ec2', config=config) response = tgw.describe_transit_gateway_attachments( Filters=[{ 'Name': 'transit-gateway-id', @@ -622,7 +563,7 @@ def tgw_decorator(resource, config): def debug(resource): - print(json.dumps(resource, indent=4, default=str)) + log(json.dumps(resource, indent=4, default=str)) def get_config(region): return Config( @@ -641,55 +582,56 @@ def handler(): output_file = "resources.json" custom_namespace_file = "custom_namespaces.json" try: - f = open("../lib/config.json", "r") + f = open("../lib/session.json", "r") main_config = json.load(f) except: - print("Could not find config file!!! You should run this from 'data' directory!") + log("Could not find session file!!! You should run this from 'data' directory!") quit() try: if main_config['ResourceFile']: output_file = main_config['ResourceFile'] except: - print('No ResourceFile configured using default') + log('No ResourceFile configured using default') try: if main_config['TagKey']: tag_name = main_config['TagKey'] except: - print('No tag key configured') + log('No tag key configured') try: if main_config['TagValues']: tag_values = main_config['TagValues'] except: - print('No tag values configured') + log('No tag values configured') try: if main_config['Regions']: regions = main_config['Regions'] except: - print('No regions configured') + log('No regions configured') try: if main_config['CustomNamespaceFile']: custom_namespace_file = main_config['CustomNamespaceFile'] except: - print('No custom namespaces configured') + log('No custom namespaces configured') decorated_resources = [] region_namespaces = {'RegionNamespaces': []} if 'us-east-1' not in regions: regions.append('us-east-1') - print('Added us-east-1 region for global services') + log('Added us-east-1 region for global services') for region in regions: config = get_config(region) - resources = get_resources(tag_name, tag_values, config) - region_namespace = {'Region': region, 
'Namespaces' : cw_custom_namespace_retriever(config) } + session = boto3.Session(profile_name=None) + resources = get_resources(tag_name, tag_values, session, config) + region_namespace = {'Region': region, 'Namespaces' : cw_custom_namespace_retriever(session, config) } region_namespaces['RegionNamespaces'].append(region_namespace) for resource in resources: - decorated_resources.append(router(resource, config)) + decorated_resources.append(router(resource, session, config)) cn = open(custom_namespace_file, "w") cn.write(json.dumps(region_namespaces, indent=4, default=str)) cn.close() diff --git a/lib/config-example.json b/lib/config-example.json new file mode 100644 index 0000000..843aa10 --- /dev/null +++ b/lib/config-example.json @@ -0,0 +1,21 @@ +{ + "BaseName": "Application", + "ResourceFile": "../data/resources.json", + "TagKey": "Name", + "TagValues": [], + "Regions": [], + "GroupingTagKey": "groupby", + "CustomEC2TagKeys": [], + "CustomNamespaceFile": "./data/custom_namespaces.json", + "Compact": true, + "CompactMaxResourcesPerWidget": 10, + "AlarmTopic": "", + "AlarmDashboard": { + "enabled": false, + "organizationId": "", + "alarmViewListSize": 100 + }, + "MetricDashboards": { + "enabled": true + } +} \ No newline at end of file diff --git a/lib/config.json b/lib/config.json deleted file mode 100644 index fee66e2..0000000 --- a/lib/config.json +++ /dev/null @@ -1,21 +0,0 @@ -{ - "BaseName": "Application", - "ResourceFile": "../data/resources.json", - "TagKey": "iem", - "TagValues": ["202202","202102"], - "Regions": ["eu-west-1"], - "GroupingTagKey": "groupby", - "CustomEC2TagKeys": ["Add","Your","TagKeys", "Here"], - "CustomNamespaceFile": "../data/custom_namespaces.json", - "Compact": false, - "CompactMaxResourcesPerWidget": 10, - "AlarmTopic": "", - "AlarmDashboard": { - "enabled": false, - "organizationId": "", - "alarmViewListSize": 100 - }, - "MetricDashboards": { - "enabled": true - } -} diff --git a/lib/iem-dashboard-stack.ts 
b/lib/iem-dashboard-stack.ts index 63ae68c..57c00b6 100644 --- a/lib/iem-dashboard-stack.ts +++ b/lib/iem-dashboard-stack.ts @@ -9,7 +9,7 @@ export class IemDashboardStack extends Stack { constructor(scope: Construct, id: string, props?: StackProps) { super(scope, id, props); - const dashboard = new Dashboard(this,config.BaseName,{ + const dashboard = new Dashboard(this, config.BaseName, { dashboardName: config.BaseName + '-Dashboard' }); @@ -17,8 +17,9 @@ export class IemDashboardStack extends Stack { try { resources = require(config.ResourceFile); console.log(`LOADED RESOURCE FILE ${config.ResourceFile}`); - } catch { - console.log(`ERROR: ${config.ResourceFile} not found, run 'cd data; python resource_collector.py'`); + } catch (error) { + console.error(`An error occurred: ${error.message}`); + console.error(`ERROR: file ${config.ResourceFile} not found, run 'cd data; python resource_collector.py'`); } const graphFactory = new GraphFactory(this,'GraphFactory',resources, config); diff --git a/lib/services/graphfactory.ts b/lib/services/graphfactory.ts index 8db93a6..7485347 100644 --- a/lib/services/graphfactory.ts +++ b/lib/services/graphfactory.ts @@ -32,6 +32,7 @@ import {MediaLiveWidgetSet} from "./servicewidgetsets/medialive"; import {EFSWidgetSet} from "./servicewidgetsets/efs"; import {SnsAction} from "aws-cdk-lib/aws-cloudwatch-actions"; import * as sns from "aws-cdk-lib/aws-sns"; +import {Ec2InstanceGroupWidgetSet} from "./servicewidgetsets/ec2group"; export class GraphFactory extends Construct { serviceArray:any=[]; @@ -148,6 +149,7 @@ export class GraphFactory extends Construct { })) for (const resource of this.serviceArray[region][servicekey]) { let apiid = resource.ResourceARN.split('/')[resource.ResourceARN.split('/').length - 1] + console.log(`APIGWV1WidgetSet-${apiid}-${region}-${this.config.BaseName}`); let apigw = new ApiGatewayV1WidgetSet(this, `APIGWV1WidgetSet-${apiid}-${region}-${this.config.BaseName}`, resource, this.config); for (const 
widgetSet of apigw.getWidgetSets()) { this.widgetArray.push(widgetSet); @@ -225,7 +227,12 @@ export class GraphFactory extends Construct { case "ec2instances": { //We create the dashboard only if we actually have EC2s in the workload - this.processEC2(region, servicekey); + if (this.config.Compact){ + this.processCompactEC2(region, servicekey); + } else { + this.processEC2(region, servicekey); + } + break; } case "lambda": { @@ -594,11 +601,14 @@ export class GraphFactory extends Construct { this.serviceArray[region]["elasticfilesystem"].push(resource); } }else if (resource.ResourceARN.includes(':ec2:') && resource.ResourceARN.includes(':instance/')) { - if (!this.serviceArray[region]["ec2instances"]) { - this.serviceArray[region]["ec2instances"] = [resource]; - } else { - this.serviceArray[region]["ec2instances"].push(resource); + if ( resource.Instance && resource.Instance.State && resource.Instance.State.Name != 'terminated'){ + if (!this.serviceArray[region]["ec2instances"]) { + this.serviceArray[region]["ec2instances"] = [resource]; + } else { + this.serviceArray[region]["ec2instances"].push(resource); + } } + } else if (resource.ResourceARN.includes(':lambda:') && resource.ResourceARN.includes(':function:')) { if (!this.serviceArray[region]["lambda"]) { this.serviceArray[region]["lambda"] = [resource]; @@ -765,6 +775,90 @@ export class GraphFactory extends Construct { } + private processCompactEC2(region: string, servicekey: any) { + const resourceGroups = new Map<string, any[]>(); + + for (const resource of this.serviceArray[region][servicekey]) { + let groupName = 'default'; + + if (this.groupResourcesByTag) { + for (const tag of resource.Tags) { + if (tag.Key === this.config.GroupingTagKey) { + tag.Value = tag.Value.replace(/\s/g, ''); + groupName = tag.Value; + break; + } + } + } + + if (!resourceGroups.has(groupName)) { + resourceGroups.set(groupName, []); + } + resourceGroups?.get(groupName)?.push(resource); + } + + const instancesPerWidget = Math.min( + 100,
this.config.CompactMaxResourcesPerWidget + ); + + for (const [key, instances] of resourceGroups) { + console.log(`processing key ${key}`); + let dashboard: any = new Dashboard( + this, + `${this.config.BaseName}-EC2-${key}-${region}`, + { + dashboardName: `${this.config.BaseName}-EC2-${key}-${region}` + } + ); + this.estimatedCost += 3; + let widgetSet: any = []; + let alarmSet: any = []; + + if (instances) { + let instancesRemaining = 0; + if (instances.length && instances.length > 0) { + instancesRemaining = instances.length; + } + let offset = 0; + while (instancesRemaining > 0) { + let instanceIncrement = instances.splice( + 0, + instancesPerWidget + ); + let instanceSet = new Ec2InstanceGroupWidgetSet( + this, + `EC2-${key}-${region}-${offset}-${this.config.BaseName}`, + instanceIncrement, + this.config + ); + for (let widget of instanceSet.getWidgetSets()) { + widgetSet.push(widget); + } + alarmSet = alarmSet.concat(instanceSet.getAlarmSet()); + instancesRemaining -= instancesPerWidget; + offset += 1; + } + } + + if (alarmSet.length > 0) { + this.estimatedCost += alarmSet.length * 0.1; + const height = + 1 + Math.floor(alarmSet.length / 4) + (alarmSet.length % 4 != 0 ? 
1 : 0); + const ec2AlarmStatusWidget = new AlarmStatusWidget({ + title: 'Alarms', + width: 24, + height: height, + alarms: alarmSet + }); + widgetSet = [ec2AlarmStatusWidget].concat(widgetSet); + } + for (const widgetSetElement of widgetSet) { + dashboard.addWidgets(widgetSetElement); + } + } + } + private processLambda(region: string, servicekey: any) { for (const resource of this.serviceArray[region][servicekey]) { let lambda = new LambdaWidgetSet(this, `Lambda-WS-${resource.Configuration.FunctionName}-${region}-${this.config.BaseName}`, resource, this.config); diff --git a/lib/services/servicewidgetsets/ec2group.ts b/lib/services/servicewidgetsets/ec2group.ts index 6c84f31..c61a3d3 100644 --- a/lib/services/servicewidgetsets/ec2group.ts +++ b/lib/services/servicewidgetsets/ec2group.ts @@ -16,7 +16,7 @@ export class Ec2InstanceGroupWidgetSet extends Construct implements WidgetSet { widgetSet:any = []; alarmSet:any = []; - constructor(scope: Construct, id: string, resource:any) { + constructor(scope: Construct, id: string, resource:any, config:any) { super(scope, id) let region = resource[0].ResourceARN.split(':')[3]; this.widgetSet.push(new TextWidget({ @@ -51,7 +51,16 @@ export class Ec2InstanceGroupWidgetSet extends Construct implements WidgetSet { region: region, left: cpuUtilMetricArray, right: [averageCpuMetric], - width: 12 + width: 12, + height: 8, + leftYAxis: { + min: 0, + max: 100 + }, + rightYAxis: { + min: 0, + max: 100 + } }) const avgCpuAlarm = averageCpuMetric.createAlarm(this,'AvgEC2CpuAlarm-' + region,{ @@ -69,12 +78,55 @@ export class Ec2InstanceGroupWidgetSet extends Construct implements WidgetSet { region: region, left: networkInMetricArray, right: networkOUtMetricArray, - width: 12 + width: 12, + height: 8, + leftYAxis: { + label: 'Network In', + }, + rightYAxis: { + label: 'Network Out', + } }) this.alarmSet.push(avgCpuAlarm) this.widgetSet.push(new Row(cpuwidget,netwidget)) this.widgetSet.push(new 
Row(ebsWriteBytesWidget,ebsReadBytesWidget)); + + const cwagentInstances = this.getCWAgentInstances(resource); + if ( cwagentInstances.length > 0 ){ + const memorywidget = new GraphWidget({ + title: 'Memory Utilisation', + region: region, + left: this.getCWAgentMetricArray(cwagentInstances,'mem_used_percent'), + width: 12, + height: 8, + leftYAxis: { + min: 0, + max: 100 + }, + rightYAxis: { + min: 0, + max: 100 + } + }); + + const diskwidget = new GraphWidget({ + title: 'Disk Utilisation', + region: region, + left: this.getCWAgentMetricArray(cwagentInstances, 'disk_used_percent'), + width: 12, + height: 8, + leftYAxis: { + min: 0, + max: 100 + }, + rightYAxis: { + min: 0, + max: 100 + } + }); + this.widgetSet.push(new Row(memorywidget, diskwidget)); + } } private getMetricArray(instances:any,metric:string,period?:Duration,statistic?:Statistic){ @@ -102,6 +154,84 @@ export class Ec2InstanceGroupWidgetSet extends Construct implements WidgetSet { return metricarray; } + private getCWAgentMetricArray(instances:any,metric:string,period?:Duration,statistic?:Statistic){ + let metricarray:Metric[] = []; + let metricperiod = Duration.minutes(1); + let metricstatistic = Statistic.SUM + if ( period ){ + metricperiod = period; + } + if ( statistic ){ + metricstatistic = statistic; + } + + for (let instance of instances){ + let instanceId = instance.ResourceARN.split('/')[instance.ResourceARN.split('/').length - 1]; + for ( let CWAgentMetric of instance.CWAgentMetrics){ + + if ( CWAgentMetric['MetricName'] == metric ){ + let path:any = false; + if ( metric == 'disk_used_percent'){ + for (const dimension of CWAgentMetric['Dimensions']){ + if ( dimension['Name'] === 'path' ){ + + if ( ! dimension['Value'].includes('/proc') + && ! dimension['Value'].includes('/sys') + && ! dimension['Value'].includes('/dev') + && ! 
dimension['Value'].includes('/run') ){ + path = dimension['Value']; + + metricarray.push(new Metric({ + namespace: 'CWAgent', + label: `${instanceId}-${path}`, + metricName: CWAgentMetric['MetricName'], + dimensionsMap: this.generateDimensionMap(CWAgentMetric), + statistic: metricstatistic, + period:metricperiod, + })); + + } + + + } + } + } else { + metricarray.push(new Metric({ + namespace: 'CWAgent', + label: `${instanceId}`, + metricName: CWAgentMetric['MetricName'], + dimensionsMap: this.generateDimensionMap(CWAgentMetric), + statistic: metricstatistic, + period:metricperiod, + })); + } + + + } + } + + } + return metricarray; + } + + private generateDimensionMap(agentMetric:any){ + let dimensionMap:any = {}; + for (const dimension of agentMetric['Dimensions']){ + dimensionMap[dimension['Name']] = dimension['Value']; + } + return dimensionMap; + } + + private getCWAgentInstances(instances:any){ + let cwagentInstanceArray:any[] = []; + for (const instance of instances) { + if (instance['CWAgentMetrics']){ + cwagentInstanceArray.push(instance); + } + } + return cwagentInstanceArray; + } + getWidgetSets(){ return this.widgetSet; } diff --git a/requirements.txt b/requirements.txt index 94bbd78..d14b683 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,2 +1,4 @@ boto3 -botocore \ No newline at end of file +botocore +tqdm +InquirerPy diff --git a/screenshots/Architecture-Alerts.png b/screenshots/Architecture-Alerts.png new file mode 100644 index 0000000..5b1daf6 Binary files /dev/null and b/screenshots/Architecture-Alerts.png differ diff --git a/screenshots/Architecture-Dashboards-MultiAccount.png b/screenshots/Architecture-Dashboards-MultiAccount.png new file mode 100644 index 0000000..df793af Binary files /dev/null and b/screenshots/Architecture-Dashboards-MultiAccount.png differ diff --git a/screenshots/Architecture-Dashboards-SingleAccount.png b/screenshots/Architecture-Dashboards-SingleAccount.png new file mode 100644 index 0000000..7830d65 Binary files 
/dev/null and b/screenshots/Architecture-Dashboards-SingleAccount.png differ diff --git a/stack_sets/event_forwarder_template.yaml b/stack_sets/event_forwarder_template.yaml index 0acd152..78cbefd 100644 --- a/stack_sets/event_forwarder_template.yaml +++ b/stack_sets/event_forwarder_template.yaml @@ -1,5 +1,17 @@ AWSTemplateFormatVersion: '2010-09-09' +Parameters: + + CentralBusARN: + Type: String + Description: ARN of EventBus in the central Account + Default: REPLACE_WITH_CENTRAL_BUS_ARN + + LambdaRoleARN: + Type: String + Description: Central Lambda Function ARN + Default: REPLACE_WITH_LAMBDA_ROLE_ARN + Resources: CentralEventBusForwardingRole: @@ -19,7 +31,7 @@ Resources: Statement: - Effect: 'Allow' Action: 'events:PutEvents' - Resource: 'REPLACE_WITH_CENTRAL_BUS_ARN' + Resource: !Ref CentralBusARN AlarmStateChangeEventRule: Type: 'AWS::Events::Rule' @@ -32,7 +44,7 @@ Resources: - 'CloudWatch Alarm State Change' State: 'ENABLED' Targets: - - Arn: 'REPLACE_WITH_CENTRAL_BUS_ARN' + - Arn: !Ref CentralBusARN Id: 'Target1' RoleArn: !GetAtt [ CentralEventBusForwardingRole, Arn ] @@ -44,7 +56,7 @@ Resources: - Action: sts:AssumeRole Effect: Allow Principal: - AWS: 'REPLACE_WITH_LAMBDA_ROLE_ARN' + AWS: !Ref LambdaRoleARN Version: "2012-10-17" Description: Role used by central Alarm event augmentation Lambda function RoleName: !Sub "CrossAccountAlarmAugmentationAssumeRole-${AWS::Region}"
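Note on the `event_forwarder_template.yaml` change above: the placeholders `REPLACE_WITH_CENTRAL_BUS_ARN` and `REPLACE_WITH_LAMBDA_ROLE_ARN` are now only parameter defaults, so real ARNs have to be supplied at deployment time. A minimal Python sketch of building that parameter list might look as follows. The function name `build_forwarder_parameters`, the `arn:` prefix check, and the example ARNs are illustrative assumptions, not part of the repository; only the parameter keys `CentralBusARN` and `LambdaRoleARN` come from the template.

```python
from typing import Dict, List


def build_forwarder_parameters(central_bus_arn: str, lambda_role_arn: str) -> List[Dict[str, str]]:
    """Build the CloudFormation parameter list for event_forwarder_template.yaml.

    The keys match the Parameters section added in this change
    (CentralBusARN, LambdaRoleARN). The function itself and the ARN
    prefix check are hypothetical helpers for illustration.
    """
    for name, value in (("CentralBusARN", central_bus_arn),
                        ("LambdaRoleARN", lambda_role_arn)):
        # Reject the template defaults (REPLACE_WITH_...) before deployment.
        if not value.startswith("arn:"):
            raise ValueError(f"{name} must be an ARN, got {value!r}")
    return [
        {"ParameterKey": "CentralBusARN", "ParameterValue": central_bus_arn},
        {"ParameterKey": "LambdaRoleARN", "ParameterValue": lambda_role_arn},
    ]


if __name__ == "__main__":
    # Placeholder account ID and resource names, for illustration only.
    params = build_forwarder_parameters(
        "arn:aws:events:eu-west-1:111111111111:event-bus/central-bus",
        "arn:aws:iam::111111111111:role/central-lambda-role",
    )
    for param in params:
        print(param["ParameterKey"], "=", param["ParameterValue"])
```

The returned list uses the `ParameterKey`/`ParameterValue` shape that the boto3 CloudFormation client accepts as its `Parameters` argument, for example in `create_stack_set`. Because the template sets an explicit `RoleName`, such a deployment would also need the `CAPABILITY_NAMED_IAM` capability.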