1
00:00:00,090 --> 00:00:01,500
So now let's talk about

2
00:00:01,500 --> 00:00:03,080
solution architecture,

3
00:00:03,080 --> 00:00:05,939
to see how we can make an EC2 instance

4
00:00:05,939 --> 00:00:07,920
become highly available.

5
00:00:07,920 --> 00:00:09,630
Because we know that an EC2 instance,

6
00:00:09,630 --> 00:00:12,780
by default, it's launched in one Availability Zone,

7
00:00:12,780 --> 00:00:14,250
and it's not really highly available,

8
00:00:14,250 --> 00:00:17,270
but we can engineer something to make it highly available,

9
00:00:17,270 --> 00:00:19,370
and that is the whole purpose of this lecture.

10
00:00:19,370 --> 00:00:21,320
We'll see there are different ways of doing things,

11
00:00:21,320 --> 00:00:22,860
and it all depends on your requirements

12
00:00:22,860 --> 00:00:25,090
and the amount of work you wanna do.

13
00:00:25,090 --> 00:00:27,880
So, let's say we have a Public EC2 instance

14
00:00:27,880 --> 00:00:29,410
that's running a web server,

15
00:00:29,410 --> 00:00:31,700
and we wanna be able to access the web server,

16
00:00:31,700 --> 00:00:34,250
so what we'll do is that we'll attach an Elastic IP

17
00:00:34,250 --> 00:00:35,670
to that EC2 instance,

18
00:00:35,670 --> 00:00:38,020
and so our users can access our website

19
00:00:38,020 --> 00:00:39,950
directly through this Elastic IP,

20
00:00:39,950 --> 00:00:42,160
and they will be directly talking to the EC2 instance

21
00:00:42,160 --> 00:00:43,360
thanks to it,

22
00:00:43,360 --> 00:00:46,640
and we get a result from our web server, so this is great.

23
00:00:46,640 --> 00:00:48,200
But now, what we want to do

24
00:00:48,200 --> 00:00:50,600
is have a Standby EC2 instance,

25
00:00:50,600 --> 00:00:52,820
just in case things go wrong,

26
00:00:52,820 --> 00:00:55,500
that makes our EC2 instance highly available.

27
00:00:55,500 --> 00:00:57,910
Now, we need to be able to fail over

28
00:00:57,910 --> 00:01:01,270
to our Standby EC2 instance, in case something goes wrong.

29
00:01:01,270 --> 00:01:03,220
So how do we know if something goes wrong?

30
00:01:03,220 --> 00:01:04,720
Well, you should think that anytime

31
00:01:04,720 --> 00:01:07,210
you want to know that something is about to go wrong,

32
00:01:07,210 --> 00:01:09,350
there must be some kind of monitoring in place.

33
00:01:09,350 --> 00:01:12,070
So, we're going to create a CloudWatch Event

34
00:01:12,070 --> 00:01:14,730
or a CloudWatch Alarm, based on an event we know.

35
00:01:14,730 --> 00:01:16,850
For example, if we have a CloudWatch Event,

36
00:01:16,850 --> 00:01:20,280
maybe we want to see if an instance is getting terminated.

37
00:01:20,280 --> 00:01:22,760
Or if we have a web server,

38
00:01:22,760 --> 00:01:25,510
and we know the CPU can go all the way to 100%,

39
00:01:25,510 --> 00:01:27,360
maybe you want to have a CloudWatch Alarm

40
00:01:27,360 --> 00:01:28,840
that monitors the CPU,

41
00:01:28,840 --> 00:01:30,890
and if we see the CPU is at 100%,

42
00:01:30,890 --> 00:01:33,080
maybe the EC2 instance has gone wrong,

43
00:01:33,080 --> 00:01:35,340
and we want to trigger an alarm based on that.

44
00:01:35,340 --> 00:01:37,790
So there are different ways of monitoring your EC2 instance,

45
00:01:37,790 --> 00:01:40,060
based on what your requirements may be.

46
00:01:40,060 --> 00:01:43,300
Then, from the Alarm or the CloudWatch Events,

47
00:01:43,300 --> 00:01:46,470
you could go ahead and trigger a Lambda function.

48
00:01:46,470 --> 00:01:47,720
And that Lambda function

49
00:01:47,720 --> 00:01:50,270
will allow you to do whatever you want,

50
00:01:50,270 --> 00:01:51,120
and that Lambda function,

51
00:01:51,120 --> 00:01:53,200
for example, can issue API calls

52
00:01:53,200 --> 00:01:56,070
to start the instance if it hasn't been started yet, okay,

53
00:01:56,070 --> 00:01:57,830
if there's no Standby EC2 instance.

54
00:01:57,830 --> 00:02:00,120
And then, issue an API call

55
00:02:00,120 --> 00:02:04,140
to attach the Elastic IP to my Standby instance.

56
00:02:04,140 --> 00:02:06,530
So now the Elastic IP will be attached,

57
00:02:06,530 --> 00:02:08,620
and it will be obviously detached

58
00:02:08,620 --> 00:02:09,919
from the other EC2 instance,

59
00:02:09,919 --> 00:02:11,580
because an Elastic IP can only be attached

60
00:02:11,580 --> 00:02:13,250
to one instance at a time,

61
00:02:13,250 --> 00:02:16,510
and the other EC2 instance can be terminated or disappear,

62
00:02:16,510 --> 00:02:18,900
and we have effectively failed over

63
00:02:18,900 --> 00:02:21,120
to a new Standby EC2 instance.

64
00:02:21,120 --> 00:02:24,430
But our users, because they communicate with our architecture

65
00:02:24,430 --> 00:02:25,990
thanks to the Elastic IP,

66
00:02:25,990 --> 00:02:27,440
they don't really see anything happening,

67
00:02:27,440 --> 00:02:28,520
it's all in the back end.

68
00:02:28,520 --> 00:02:29,670
And so that's one way

69
00:02:29,670 --> 00:02:32,330
of creating a highly available EC2 instance.

70
00:02:32,330 --> 00:02:33,690
But there are more ways.

71
00:02:33,690 --> 00:02:35,820
Okay, let's talk about a second way of doing it,

72
00:02:35,820 --> 00:02:37,570
with an Auto Scaling Group.

73
00:02:37,570 --> 00:02:41,440
So, we have an ASG in two Availability Zones,

74
00:02:41,440 --> 00:02:43,300
and again, we're using the same concept,

75
00:02:43,300 --> 00:02:45,950
where a user is going to be talking to our application

76
00:02:45,950 --> 00:02:47,930
using an Elastic IP because it makes things

77
00:02:47,930 --> 00:02:49,450
a little bit simpler.

78
00:02:49,450 --> 00:02:52,360
So now how should we configure our Auto Scaling Group?

79
00:02:52,360 --> 00:02:54,290
What if we configure it this way,

80
00:02:54,290 --> 00:02:56,870
we say the minimum number of instances is one,

81
00:02:56,870 --> 00:02:59,690
the maximum is one, and we want one desired,

82
00:02:59,690 --> 00:03:02,860
and we specify over two Availability Zones.

83
00:03:02,860 --> 00:03:04,480
So, what does it mean?

84
00:03:04,480 --> 00:03:06,970
That means we're going to get only one EC2 instance,

85
00:03:06,970 --> 00:03:09,980
and that EC2 instance may be in the first AZ.

86
00:03:09,980 --> 00:03:11,730
And that's what we get out of these settings.

87
00:03:11,730 --> 00:03:13,490
So why would we use these settings?

88
00:03:13,490 --> 00:03:15,650
Well, for example, we can say that

89
00:03:15,650 --> 00:03:17,890
on the user data of the EC2 instance,

90
00:03:17,890 --> 00:03:19,120
when it does come up,

91
00:03:19,120 --> 00:03:21,580
it's going to acquire and attach

92
00:03:21,580 --> 00:03:24,700
this Elastic IP address based on Tags.

93
00:03:24,700 --> 00:03:27,210
So this user data will issue API calls

94
00:03:27,210 --> 00:03:29,800
and the Elastic IP will be attached

95
00:03:29,800 --> 00:03:31,350
to our Public EC2 instance,

96
00:03:31,350 --> 00:03:35,300
and our users will be able to talk to our web server.

97
00:03:35,300 --> 00:03:37,870
But now, let's imagine that this instance

98
00:03:37,870 --> 00:03:39,730
is being terminated, it goes down,

99
00:03:39,730 --> 00:03:41,430
and so what the ASG will do,

100
00:03:41,430 --> 00:03:43,640
is that it will terminate the first instance

101
00:03:43,640 --> 00:03:47,540
and create a Replacement EC2 instance in another AZ,

102
00:03:47,540 --> 00:03:48,910
and thanks to that,

103
00:03:48,910 --> 00:03:51,620
what we get is that, the first instance is terminated,

104
00:03:51,620 --> 00:03:55,060
and the second instance will run its EC2 user data scripts

105
00:03:55,060 --> 00:03:57,530
and attach the Elastic IP.

106
00:03:57,530 --> 00:04:00,110
And we have effectively failed over. So in this case,

107
00:04:00,110 --> 00:04:02,590
we don't need a CloudWatch Alarm or a CloudWatch Event,

108
00:04:02,590 --> 00:04:04,330
the Auto Scaling Group as soon as it sees

109
00:04:04,330 --> 00:04:06,510
that one instance has been terminated,

110
00:04:06,510 --> 00:04:07,790
thanks to its settings,

111
00:04:07,790 --> 00:04:10,980
will create a new EC2 instance in another AZ.

112
00:04:10,980 --> 00:04:13,340
And the reason we have one min, one max, and one desired

113
00:04:13,340 --> 00:04:15,310
is that we'll never get more than one instance

114
00:04:15,310 --> 00:04:16,660
running at the same time

115
00:04:16,660 --> 00:04:19,720
in our entire ASG, which is great.

116
00:04:19,720 --> 00:04:21,990
Finally, because our EC2 instance

117
00:04:21,990 --> 00:04:24,340
does do API calls directly,

118
00:04:24,340 --> 00:04:26,580
to attach this Elastic IP Address,

119
00:04:26,580 --> 00:04:28,950
then we need to make sure that the EC2 instance

120
00:04:28,950 --> 00:04:32,610
has an instance role, that allows it to issue API calls

121
00:04:32,610 --> 00:04:34,690
to attach this Elastic IP Address.

122
00:04:34,690 --> 00:04:38,100
So here we have a combo of using EC2 User Data

123
00:04:38,100 --> 00:04:40,070
to attach the Elastic IP Address,

124
00:04:40,070 --> 00:04:42,260
and also having an EC2 instance role

125
00:04:42,260 --> 00:04:45,910
to make sure the API call will succeed.

126
00:04:45,910 --> 00:04:48,560
So we can extend this pattern to another thing.

127
00:04:48,560 --> 00:04:50,410
For example, our EC2 instance,

128
00:04:50,410 --> 00:04:52,470
can be stateful and have an EBS volume,

129
00:04:52,470 --> 00:04:53,890
so we can get even more complicated,

130
00:04:53,890 --> 00:04:55,650
so let's get started with it,

131
00:04:55,650 --> 00:04:56,723
so we have an Auto Scaling Group,

132
00:04:56,723 --> 00:04:58,770
two AZs, our Public EC2 instance,

133
00:04:58,770 --> 00:05:00,990
and an Elastic IP, so we already know this.

134
00:05:00,990 --> 00:05:03,070
But now, we also have an EBS Volume

135
00:05:03,070 --> 00:05:05,510
attached to our EC2 instance,

136
00:05:05,510 --> 00:05:06,500
let's imagine for example,

137
00:05:06,500 --> 00:05:07,980
that EC2 instance is a database,

138
00:05:07,980 --> 00:05:10,650
and we're trying to make that database highly available.

139
00:05:10,650 --> 00:05:13,260
So, all of our data is on our EBS Volume,

140
00:05:13,260 --> 00:05:14,910
and we know an EBS Volume

141
00:05:14,910 --> 00:05:17,960
is locked into a specific Availability Zone.

142
00:05:17,960 --> 00:05:22,220
So let's imagine that our EC2 instance is being terminated,

143
00:05:22,220 --> 00:05:23,900
and now what should we do?

144
00:05:23,900 --> 00:05:25,780
Well, we know that on termination,

145
00:05:25,780 --> 00:05:28,700
the Auto Scaling Group can use lifecycle hooks,

146
00:05:28,700 --> 00:05:30,630
and thanks to this lifecycle hook,

147
00:05:30,630 --> 00:05:34,760
we can create a script to take that EBS Volume

148
00:05:34,760 --> 00:05:37,040
and create an EBS Snapshot from it.

149
00:05:37,040 --> 00:05:38,390
Because, it will be triggered

150
00:05:38,390 --> 00:05:40,240
as soon as the EC2 instance goes down,

151
00:05:40,240 --> 00:05:43,160
and so we know that the EBS volume will be freed.

152
00:05:43,160 --> 00:05:44,850
So we have an EBS Snapshot,

153
00:05:44,850 --> 00:05:46,460
and we tag it properly,

154
00:05:46,460 --> 00:05:49,440
and the ASG will be launching a Replacement EC2 instance,

155
00:05:49,440 --> 00:05:51,090
we have the same settings as before,

156
00:05:51,090 --> 00:05:53,750
and now, by properly configuring again

157
00:05:53,750 --> 00:05:54,960
our Auto Scaling Group

158
00:05:54,960 --> 00:05:58,050
to create a lifecycle hook on the Launch event,

159
00:05:58,050 --> 00:06:02,040
then we can create an EBS Volume out of this EBS Snapshot

160
00:06:02,040 --> 00:06:03,910
into the correct Availability Zone,

161
00:06:03,910 --> 00:06:07,470
and then attach it to the Replacement EC2 instance.

162
00:06:07,470 --> 00:06:09,910
And then the EC2 user data can just check that

163
00:06:09,910 --> 00:06:13,520
and also attach the Elastic IP address directly,

164
00:06:13,520 --> 00:06:16,230
and we need to make sure obviously the API calls are correct

165
00:06:16,230 --> 00:06:18,600
so we need to have an EC2 instance role,

166
00:06:18,600 --> 00:06:19,650
so as we can see here,

167
00:06:19,650 --> 00:06:22,360
we've done a combo of EC2 user data,

168
00:06:22,360 --> 00:06:24,340
and also lifecycle hooks

169
00:06:24,340 --> 00:06:26,280
to make sure that the EBS Volume

170
00:06:26,280 --> 00:06:27,820
was first getting snapshotted

171
00:06:27,820 --> 00:06:29,610
and then being restored from the Snapshot

172
00:06:29,610 --> 00:06:30,980
into a different AZ.

173
00:06:30,980 --> 00:06:33,900
And that makes it a highly available EC2 instance,

174
00:06:33,900 --> 00:06:35,390
with an EBS volume.

175
00:06:35,390 --> 00:06:37,280
So as we can see the possibilities are endless,

176
00:06:37,280 --> 00:06:38,560
but it's good to see them once

177
00:06:38,560 --> 00:06:40,690
to see how these kinds of architectures can work,

178
00:06:40,690 --> 00:06:41,740
obviously, they're a bit more work,

179
00:06:41,740 --> 00:06:42,800
they're a bit more custom,

180
00:06:42,800 --> 00:06:45,260
but we can achieve great things with automation.

181
00:06:45,260 --> 00:06:46,950
So that's it for this lecture, I hope you liked it,

182
00:06:46,950 --> 00:06:48,900
and I will see you in the next lecture.