1
00:00:00,450 --> 00:00:03,090
‫Now let's discuss the Kinesis Client Library, or KCL.

2
00:00:03,090 --> 00:00:05,140
‫So it is something that the exam can ask you

3
00:00:05,140 --> 00:00:06,410
‫one scenario question about,

4
00:00:06,410 --> 00:00:09,010
‫so let's go over how it works in the scenario question.

5
00:00:09,010 --> 00:00:12,070
‫So this is a Java library that helps you read records

6
00:00:12,070 --> 00:00:15,100
‫from a Kinesis Data Stream with distributed applications

7
00:00:15,100 --> 00:00:17,830
‫that will be sharing the read workload.

8
00:00:17,830 --> 00:00:21,280
‫And each shard is to be read by only one KCL instance,

9
00:00:21,280 --> 00:00:22,970
‫that means that if you have 4 shards,

10
00:00:22,970 --> 00:00:26,260
‫you get a maximum of 4 KCL instances.

11
00:00:26,260 --> 00:00:29,820
‫If you have 6 shards, you get a maximum of 6 KCL instances.

12
00:00:29,820 --> 00:00:32,090
‫And if I just say that, you're good to go for the exam,

13
00:00:32,090 --> 00:00:33,980
‫but I want to explain to you exactly how it works,

14
00:00:33,980 --> 00:00:35,480
‫so you can get an idea about

15
00:00:35,480 --> 00:00:37,140
‫how the Kinesis Client Library works.

16
00:00:37,140 --> 00:00:39,810
‫So, the Kinesis Client Library will be reading

17
00:00:39,810 --> 00:00:41,870
‫from our Kinesis Data Stream

18
00:00:41,870 --> 00:00:44,280
‫and the progress of how far it's been reading

19
00:00:44,280 --> 00:00:46,820
‫is going to be checkpointed into DynamoDB,

20
00:00:46,820 --> 00:00:49,400
‫and so your application running KCL

21
00:00:49,400 --> 00:00:51,960
‫will need IAM access to DynamoDB.

22
00:00:51,960 --> 00:00:53,680
‫It will be able, thanks to DynamoDB,

23
00:00:53,680 --> 00:00:56,670
‫to track the other workers of your KCL application

24
00:00:56,670 --> 00:00:59,180
‫and share the work across shards.

25
00:00:59,180 --> 00:01:01,100
‫KCL can run on anything you want

26
00:01:01,100 --> 00:01:03,480
‫so you can be running it on EC2 instances,

27
00:01:03,480 --> 00:01:05,100
‫with an EC2 instance role,

28
00:01:05,100 --> 00:01:07,050
‫your Elastic Beanstalk application,

29
00:01:07,050 --> 00:01:08,790
‫or on-premises servers,

30
00:01:08,790 --> 00:01:11,680
‫as long as they have correct IAM credentials.

31
00:01:11,680 --> 00:01:13,000
‫The records are going to be read in order

32
00:01:13,000 --> 00:01:15,260
‫and at the shard level obviously,

33
00:01:15,260 --> 00:01:18,020
‫and there are two versions of the Kinesis Client Library,

34
00:01:18,020 --> 00:01:20,650
‫Version 1 supports only the shared consumer mode,

35
00:01:20,650 --> 00:01:22,640
‫and Version 2 of KCL

36
00:01:22,640 --> 00:01:26,370
‫supports both shared and enhanced fan-out consumer modes.

37
00:01:26,370 --> 00:01:31,060
‫So, if we look at an example of 4 shards in our stream,

38
00:01:31,060 --> 00:01:34,520
‫we can have a DynamoDB table to checkpoint the progress,

39
00:01:34,520 --> 00:01:36,840
‫and so we can run two KCL apps

40
00:01:36,840 --> 00:01:39,260
‫of the same consumer application

41
00:01:39,260 --> 00:01:42,630
‫running on two different EC2 instances.

42
00:01:42,630 --> 00:01:44,980
‫In this case, thanks to DynamoDB,

43
00:01:44,980 --> 00:01:46,710
‫they will know how to share the work,

44
00:01:46,710 --> 00:01:48,970
‫so the first KCL app is going to be reading

45
00:01:48,970 --> 00:01:49,987
‫from shards 1 and 2,

46
00:01:49,987 --> 00:01:52,040
‫and the second KCL app is going to be reading

47
00:01:52,040 --> 00:01:53,840
‫from shards 3 and 4.
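
To make that split concrete, here is a minimal Python sketch of how 4 shards could be divided between 2 KCL apps. It is only an illustration, not the actual KCL lease-balancing logic; the function name `assign_shards` and the shard and app names are made up for this example.

```python
def assign_shards(shards, workers):
    """Split shards into contiguous, near-equal groups, one group per worker.

    Illustration only: the real KCL coordinates ownership through its
    DynamoDB table rather than through a single function like this.
    """
    if not workers:
        raise ValueError("need at least one worker")
    base, extra = divmod(len(shards), len(workers))
    assignment = {}
    start = 0
    for i, worker in enumerate(workers):
        # The first `extra` workers take one additional shard each.
        count = base + (1 if i < extra else 0)
        assignment[worker] = shards[start:start + count]
        start += count
    return assignment


shards = ["shard-1", "shard-2", "shard-3", "shard-4"]
print(assign_shards(shards, ["kcl-app-1", "kcl-app-2"]))
# {'kcl-app-1': ['shard-1', 'shard-2'], 'kcl-app-2': ['shard-3', 'shard-4']}
```

With 4 apps and the same 4 shards, the same function gives each app exactly one shard.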

48
00:01:53,840 --> 00:01:55,700
‫Now, the progress of how far

49
00:01:55,700 --> 00:01:58,170
‫the app has been reading into the Kinesis Data Stream

50
00:01:58,170 --> 00:02:00,550
‫will be checkpointed into DynamoDB.

51
00:02:00,550 --> 00:02:03,590
‫And so, for example, if one of these applications goes down,

52
00:02:03,590 --> 00:02:06,870
‫DynamoDB and KCL apps working together,

53
00:02:06,870 --> 00:02:08,380
‫will know that an app went down,

54
00:02:08,380 --> 00:02:10,950
‫and so reading from its shards will be resumed by the other app

55
00:02:10,950 --> 00:02:12,623
‫from where it was checkpointed.
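
Here is a small Python sketch of that failover idea, under some loud assumptions: the dictionary below is a hypothetical in-memory stand-in for the DynamoDB table, the `take_over` function is invented for this example, and checkpoints are simplified to plain integers rather than real Kinesis sequence numbers.

```python
# Hypothetical stand-in for the DynamoDB table: one entry per shard,
# holding the current owner and the last checkpointed position.
lease_table = {
    "shard-1": {"owner": "kcl-app-1", "checkpoint": 120},
    "shard-2": {"owner": "kcl-app-1", "checkpoint": 98},
    "shard-3": {"owner": "kcl-app-2", "checkpoint": 57},
    "shard-4": {"owner": "kcl-app-2", "checkpoint": 300},
}


def take_over(table, failed_worker, surviving_worker):
    """Hand the failed worker's shards to the survivor.

    Returns, per reassigned shard, the checkpoint from which reading resumes.
    """
    resume_points = {}
    for shard, lease in table.items():
        if lease["owner"] == failed_worker:
            lease["owner"] = surviving_worker
            resume_points[shard] = lease["checkpoint"]
    return resume_points


# kcl-app-2 goes down; kcl-app-1 picks up its shards at the saved checkpoints.
print(take_over(lease_table, "kcl-app-2", "kcl-app-1"))
# {'shard-3': 57, 'shard-4': 300}
```

The point of the sketch is that nothing is re-read from the start of the shard: the saved checkpoint tells the surviving app where to continue.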

56
00:02:13,620 --> 00:02:15,090
‫It works also when you scale up,

57
00:02:15,090 --> 00:02:16,400
‫so if you have 4 shards

58
00:02:16,400 --> 00:02:19,250
‫and now you run 4 KCL applications,

59
00:02:19,250 --> 00:02:21,910
‫then each will be reading from one shard.

60
00:02:21,910 --> 00:02:24,427
‫And therefore the progress will be resumed from DynamoDB

61
00:02:24,427 --> 00:02:25,610
‫and checkpointed again.

62
00:02:25,610 --> 00:02:27,310
‫So you can see how this works, right?

63
00:02:27,310 --> 00:02:29,800
‫But we cannot have more KCL apps than shards,

64
00:02:29,800 --> 00:02:32,980
‫because well, otherwise one will be doing nothing.
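
That rule is just a minimum, and a short Python sketch shows it. The helper name `active_and_idle` is made up for this illustration; it only counts how many apps get a shard and how many sit idle.

```python
def active_and_idle(num_workers, num_shards):
    """Return (workers that own at least one shard, workers left with nothing)."""
    active = min(num_workers, num_shards)
    return active, num_workers - active


for workers, shards in [(2, 4), (4, 4), (5, 4), (6, 6)]:
    active, idle = active_and_idle(workers, shards)
    print(f"{workers} apps, {shards} shards -> {active} active, {idle} idle")
# 2 apps, 4 shards -> 2 active, 0 idle
# 4 apps, 4 shards -> 4 active, 0 idle
# 5 apps, 4 shards -> 4 active, 1 idle
# 6 apps, 6 shards -> 6 active, 0 idle
```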

65
00:02:32,980 --> 00:02:35,070
‫So if you want to scale your reads further,

66
00:02:35,070 --> 00:02:37,800
‫you can scale Kinesis up to 6 shards,

67
00:02:37,800 --> 00:02:39,930
‫so now we still have our 4 KCL applications,

68
00:02:39,930 --> 00:02:42,830
‫but now we have 6 shards in the Kinesis data stream.

69
00:02:42,830 --> 00:02:45,450
‫And so again, they will detect this change,

70
00:02:45,450 --> 00:02:48,190
‫and working together with DynamoDB,

71
00:02:48,190 --> 00:02:50,010
‫they will, again, split the work

72
00:02:50,010 --> 00:02:54,430
‫between the KCL applications with new shard assignments.

73
00:02:54,430 --> 00:02:56,420
‫So that means that once we have 6 shards

74
00:02:56,420 --> 00:02:57,570
‫in our Kinesis Data Stream,

75
00:02:57,570 --> 00:03:01,580
‫then we can have up to 6 KCL applications reading from them,

76
00:03:01,580 --> 00:03:04,040
‫and checkpointing the progress into DynamoDB.

77
00:03:04,040 --> 00:03:05,000
‫If you've understood that,

78
00:03:05,000 --> 00:03:06,880
‫then you will be good to go for the exam

79
00:03:06,880 --> 00:03:08,400
‫to answer the question.

80
00:03:08,400 --> 00:03:10,560
‫That's it for this lecture, I hope you liked it,

81
00:03:10,560 --> 00:03:12,510
‫and I will see you in the next lecture.