1 00:00:00,000 --> 00:00:02,307 So now let's talk about VPC flow logs. 2 00:00:02,307 --> 00:00:04,620 VPC flow logs allow you to capture information 3 00:00:04,620 --> 00:00:07,050 from IP traffic going into your interfaces. 4 00:00:07,050 --> 00:00:08,880 So that could be at the VPC level, 5 00:00:08,880 --> 00:00:10,170 the subnet level, 6 00:00:10,170 --> 00:00:13,050 or the Elastic Network Interface, ENI, level. 7 00:00:13,050 --> 00:00:15,390 So we have three kinds of flow logs. 8 00:00:15,390 --> 00:00:16,890 They're very helpful to monitor 9 00:00:16,890 --> 00:00:18,840 and troubleshoot connectivity issues 10 00:00:18,840 --> 00:00:20,610 happening within your VPC. 11 00:00:20,610 --> 00:00:22,590 So these flow logs can be sent to two destinations. 12 00:00:22,590 --> 00:00:25,470 They can be sent to Amazon S3 and CloudWatch Logs. 13 00:00:25,470 --> 00:00:26,850 And they will capture information 14 00:00:26,850 --> 00:00:29,280 for also AWS managed interfaces, 15 00:00:29,280 --> 00:00:33,810 such as ELB, RDS, ElastiCache, Redshift, WorkSpaces, 16 00:00:33,810 --> 00:00:36,450 your NAT Gateway, your Transit Gateway, and so on. 17 00:00:36,450 --> 00:00:38,280 So in this graph, well, it's very simple, 18 00:00:38,280 --> 00:00:40,290 your flow logs are at the VPC level, for example, 19 00:00:40,290 --> 00:00:41,820 or subnets or ENI level, 20 00:00:41,820 --> 00:00:44,640 and they're sent to CloudWatch and Amazon S3. 21 00:00:44,640 --> 00:00:47,760 So this is what a VPC flow log looks like. 22 00:00:47,760 --> 00:00:49,920 So there is a format associated with it, 23 00:00:49,920 --> 00:00:52,860 but there is the version, account ID, interface ID, 24 00:00:52,860 --> 00:00:54,540 source address, destination address, 25 00:00:54,540 --> 00:00:56,490 source port, destination port, protocol, 26 00:00:56,490 --> 00:01:00,330 packets, bytes, start, end, action, and log status. 27 00:01:00,330 --> 00:01:02,460 So this is metadata about the network packets 28 00:01:02,460 --> 00:01:05,160 going into your VPC. 29 00:01:05,160 --> 00:01:07,650 And you don't need to remember this at a high level, right? 30 00:01:07,650 --> 00:01:10,680 But let's have a look at what information we can get 31 00:01:10,680 --> 00:01:11,790 out of these flow logs. 32 00:01:11,790 --> 00:01:13,500 So the source address and the destination address 33 00:01:13,500 --> 00:01:15,780 help identify problematic IP. 34 00:01:15,780 --> 00:01:18,540 This is where you see if an IP is repeatedly being denied. 35 00:01:18,540 --> 00:01:19,830 Maybe there is something wrong with an IP 36 00:01:19,830 --> 00:01:22,380 or maybe you're being attacked by a specific IP. 37 00:01:22,380 --> 00:01:24,030 Source port and destination port 38 00:01:24,030 --> 00:01:26,160 help you identify the problematic ports. 39 00:01:26,160 --> 00:01:29,460 Action is going to be either accept or reject, 40 00:01:29,460 --> 00:01:30,750 and so it's going to say 41 00:01:30,750 --> 00:01:32,610 whether or not it's a success or failure 42 00:01:32,610 --> 00:01:34,530 at the SG or NACL level. 43 00:01:34,530 --> 00:01:36,000 And we'll see this in the very next slide. 44 00:01:36,000 --> 00:01:38,100 And so this can be used, VPC flow logs, 45 00:01:38,100 --> 00:01:40,290 to do analytics on usage patterns, 46 00:01:40,290 --> 00:01:43,950 or detect managed behavior, port scans, and so on. 47 00:01:43,950 --> 00:01:44,910 Now, to query these flow logs, 48 00:01:44,910 --> 00:01:46,290 you have two ways of doing it. 49 00:01:46,290 --> 00:01:48,600 The best way is to do Athena on S3. 50 00:01:48,600 --> 00:01:50,670 Or if you wanted to do a streaming analysis, 51 00:01:50,670 --> 00:01:52,680 you could use CloudWatch Logs Insights. 52 00:01:52,680 --> 00:01:54,810 So here are some examples that you can have a look 53 00:01:54,810 --> 00:01:56,610 on your own time for flow logs. 54 00:01:56,610 --> 00:01:57,900 So, how do we use flow logs 55 00:01:57,900 --> 00:02:00,870 to troubleshoot Security Group and NACL issues? 56 00:02:00,870 --> 00:02:02,190 Well, we look at the Action field. 57 00:02:02,190 --> 00:02:05,220 So let's have a look at the typical incoming request 58 00:02:05,220 --> 00:02:06,330 for your NACL and your subnet. 59 00:02:06,330 --> 00:02:08,250 So remember, your NACLs are stateless 60 00:02:08,250 --> 00:02:10,800 and your Security Groups are stateful. 61 00:02:10,800 --> 00:02:11,940 So, what happens? 62 00:02:11,940 --> 00:02:13,740 Well, if we get an inbound reject, 63 00:02:13,740 --> 00:02:16,110 so we see that the inbound request 64 00:02:16,110 --> 00:02:19,380 from the outside to our EC2 instance is rejected, 65 00:02:19,380 --> 00:02:21,390 that can mean that either, from this graph, 66 00:02:21,390 --> 00:02:23,670 the NACL is refusing the request 67 00:02:23,670 --> 00:02:26,550 or the Security Group is refusing the request. 68 00:02:26,550 --> 00:02:28,110 And this makes sense, right? 69 00:02:28,110 --> 00:02:32,040 But if we have a inbound accept and an outbound reject, 70 00:02:32,040 --> 00:02:34,170 it can only mean a NACL issue. 71 00:02:34,170 --> 00:02:36,690 Why? Well, because your security group is stateful, 72 00:02:36,690 --> 00:02:39,540 and if the inbound is allowed because of the accept, 73 00:02:39,540 --> 00:02:41,880 automatically the outbound will be allowed 74 00:02:41,880 --> 00:02:44,760 thanks to the statefulness of your Security Group. 75 00:02:44,760 --> 00:02:46,950 So for outgoing requests, similar analysis, right? 76 00:02:46,950 --> 00:02:48,690 This is the diagram we know already. 77 00:02:48,690 --> 00:02:50,700 And so if you got an outbound reject, 78 00:02:50,700 --> 00:02:53,280 then you have a NACL or a Security Group issue. 79 00:02:53,280 --> 00:02:55,890 But if you got an outbound accepted and an inbound reject, 80 00:02:55,890 --> 00:02:58,530 then it must mean a NACL issue. 81 00:02:58,530 --> 00:03:00,600 So let's have a look at a few architectures 82 00:03:00,600 --> 00:03:02,160 for your VPC flow logs. 83 00:03:02,160 --> 00:03:04,860 So they can flow into CloudWatch Logs, as we know. 84 00:03:04,860 --> 00:03:06,090 And then we can use something called 85 00:03:06,090 --> 00:03:08,340 CloudWatch Contributor Insights, for example, 86 00:03:08,340 --> 00:03:10,620 to find the top 10 IP addresses 87 00:03:10,620 --> 00:03:12,900 that contribute to the most amount of network 88 00:03:12,900 --> 00:03:17,190 on your VPC, or on ENI, or whatever you want. 89 00:03:17,190 --> 00:03:18,510 You can also use VPC flow logs 90 00:03:18,510 --> 00:03:20,550 to send them again into CloudWatch Logs. 91 00:03:20,550 --> 00:03:22,650 Here, we can set up a metric filter 92 00:03:22,650 --> 00:03:26,460 to look, for example, for the SSH or the RDP protocols. 93 00:03:26,460 --> 00:03:28,610 And if we realize that there's a lot more SSH 94 00:03:28,610 --> 00:03:31,740 or RDP than usual, then trigger a CloudWatch alarm, 95 00:03:31,740 --> 00:03:34,710 and send an alert into an Amazon SNS topic 96 00:03:34,710 --> 00:03:37,980 because something fishy may be happening on your network. 97 00:03:37,980 --> 00:03:40,260 Or we can use a VPC flow log. 98 00:03:40,260 --> 00:03:43,890 And then we send everything into an S3 bucket for storage. 99 00:03:43,890 --> 00:03:46,230 And we use Amazon Athena to analyze 100 00:03:46,230 --> 00:03:48,450 the VPC flow logs with SQL. 101 00:03:48,450 --> 00:03:51,840 And we can even visualize that with Amazon QuickSight. 102 00:03:51,840 --> 00:03:53,760 So that's it for VPC flow logs. 103 00:03:53,760 --> 00:03:54,840 I hope you liked it. 104 00:03:54,840 --> 00:03:56,790 And I will see you in the next lecture.