It's much more efficient and much more meaningful to create these architecture diagrams using Splunk icons. For this exercise, I have created three scenarios: a small enterprise, a medium enterprise, and a large enterprise. And the last one is the crazy one, which involves a high availability and clustering architecture. We'll go through them one by one.

Before going to that, we have a few more things to sort out. Let's learn those things. That is license calculation, which is one of the crucial things in designing any architecture.

The crucial step of any Splunk implementation, and when I say any, it can be a small, medium, or large enterprise, is to estimate how much license you need. This is by far the most difficult step in designing the architecture, because there is no straight answer saying "I need 100 GB" or "200 GB". There can never be a straight answer for how much data we are estimating from the data sources because, as we all know, in some scenarios there will be a log spike caused by an error or an application crash. We'll see how we can best estimate the log size in our environment.

This step needs you, as a Splunk admin or an architect, to interact with other teams and ask them: what is the log size, or the data size, for yesterday? If they provide it, well and good. Next, ask them how many devices should be integrated with your Splunk. You will get a rough estimate. Keep that number. It's not over yet; you got it from one team. Repeat the same step with the other teams in the organization: the network team for the syslog inputs, the system or server teams for their data and flat files, and even the database team.

After adding up all the numbers, let's say you come to a conclusion of 100 GB of data per day. But based on my experience, it's better not to go with the exact figure of what we have calculated. It's good to take a 10 to 20% buffer so that any spike in logs is manageable and stays well under our limit. Now, to conclude: after discussing and agreeing with all the teams, we can come to a rough estimate of probably 120 GB of data per day, including our buffer. The sketch below walks through this arithmetic.
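To make the arithmetic concrete, here is a minimal sketch in Python. The team names and per-team figures are hypothetical, not from the course; the point is simply to sum the daily estimates gathered from each team and then apply the buffer on top.

```python
# Hypothetical per-team daily ingest estimates, in GB/day, as reported
# by each team for "yesterday's" data (illustrative numbers only).
team_estimates_gb = {
    "network_syslog": 35.0,      # network team: syslog inputs
    "server_flat_files": 40.0,   # system/server teams: OS logs, flat files
    "database": 25.0,            # database team: audit and query logs
}

# Sum the per-team numbers to get the base daily ingest.
base_total_gb = sum(team_estimates_gb.values())  # 100 GB/day in this example

# Add a 10-20% buffer so that a log spike (errors, application crashes)
# still stays under the licensed limit; 20% is used here.
buffer_ratio = 0.20
license_estimate_gb = base_total_gb * (1 + buffer_ratio)

print(f"Base daily ingest : {base_total_gb:.0f} GB/day")
print(f"License estimate  : {license_estimate_gb:.0f} GB/day")  # 120 GB/day
```

Whether you take 10% or 20% is a judgment call: spikier sources, such as busy application logs, usually justify the higher end of that range.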