It's much more efficient and much more meaningful to create these architecture diagrams using Splunk icons. For this exercise, I have created three scenarios: a small enterprise, a medium enterprise, and a large enterprise. And the last one is the crazy one, which involves a high availability and clustering architecture. We'll go through them one by one.

Before going to that, we have a few more things to sort out. Let's learn those things. That is license calculation, which is one of the crucial things in designing any architecture.

The crucial step of any Splunk implementation, and when I say any, it can be a small, medium, or large enterprise, is to estimate how much license you need. This is by far the most difficult step in designing the architecture, because there is no straight answer saying "I need 100 GB" or "200 GB". There can never be a straight answer for how much data we are estimating from the data sources because, as we all know, in some scenarios there will be a log spike caused by an error or an application crash. We'll see how we can best estimate the log size in our environment.

This step needs you, as a Splunk admin or an architect, to interact with other teams and ask them: what is the log size, or the data size, for yesterday? If they provide it, well and good. Next, ask them how many devices should be integrated with your Splunk. You will get a rough estimate. Keep that number. It's not over yet; you got it from one team. Repeat the same step with the other teams in the organization: the network team for the syslog inputs, the system or server teams for their data and flat files, and even the database team.

After adding up all the numbers, let's say you come to a conclusion of 100 GB of data per day. But based on my experience, it's better not to go with the exact figure of what we have calculated. It's good to take a 10 to 20% buffer so that any spike in logs is manageable and stays well under our limit. Now, to conclude: after discussing and agreeing with all the teams, we can come to a rough estimate of probably 120 GB of data per day, including our buffer. The sketch below walks through this arithmetic.
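To make the arithmetic concrete, here is a minimal sketch in Python. The team names and per-team figures are hypothetical, not from the course; the point is simply to sum the daily estimates gathered from each team and then apply the buffer on top.

```python
# Hypothetical per-team daily ingest estimates, in GB/day, as reported
# by each team for "yesterday's" data (illustrative numbers only).
team_estimates_gb = {
    "network_syslog": 35.0,      # network team: syslog inputs
    "server_flat_files": 40.0,   # system/server teams: OS logs, flat files
    "database": 25.0,            # database team: audit and query logs
}

# Sum the per-team numbers to get the base daily ingest.
base_total_gb = sum(team_estimates_gb.values())  # 100 GB/day in this example

# Add a 10-20% buffer so that a log spike (errors, application crashes)
# still stays under the licensed limit; 20% is used here.
buffer_ratio = 0.20
license_estimate_gb = base_total_gb * (1 + buffer_ratio)

print(f"Base daily ingest : {base_total_gb:.0f} GB/day")
print(f"License estimate  : {license_estimate_gb:.0f} GB/day")  # 120 GB/day
```

Whether you take 10% or 20% is a judgment call: spikier sources, such as busy application logs, usually justify the higher end of that range.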