Now that we have covered all the required factors, let's look at how the different architectures are laid out, for demonstration purposes. These designs are purely based on my experience with Splunk after working with it for close to five years. For these tutorials I have prepared three architecture scenarios: small, medium, and large enterprise, plus one more that I call the crazy one, a large enterprise with high availability and clustering.

Let's go through them one by one. The first one, as we see on the screen, is the small architecture, which maps to an organization with a license limit of less than 100 GB per day. As you can see in this picture, we have one indexer, one search head, a couple of users, and lots of data forwarded from many different sources. This is what a typical small enterprise architecture looks like.

Even the search head can be optional if you have just one or two users; a single indexer should be able to handle that load, so you can deploy everything as a single standalone instance. But always keep in mind that in a big organization, even if the license is only 100 GB per day, it is good to start deploying in distributed mode. Distributed mode means that each Splunk component runs in a dedicated role: the search head is separate, the indexer is separate, and the forwarders are separate. Because each role is configured on its own instance, we call this a distributed deployment.

I say this because an organization, say a big corporation, may have purchased a 10 GB license just for monitoring its logs. In my experience, once an organization realizes the value of Splunk and the variety of data it can process, it can scale from 10 GB to several hundred GB per day in a matter of months. So, as an architect, you should know how much data your organization has and how big the organization is, always think one step ahead so you cover these scenarios, and be ready that if ingestion grows from 10 to 200 GB you will need to add more indexers and search heads. It's better to deploy in distributed mode from the get-go.
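To make distributed mode a bit more concrete, here is a minimal sketch of the two settings that tie a dedicated forwarder to a dedicated indexer. The host name indexer01.example.com, the output group name, and the receiving port 9997 are assumptions for illustration, not values taken from this course.

    # outputs.conf on the universal forwarder (hypothetical host and port)
    [tcpout]
    defaultGroup = primary_indexers

    [tcpout:primary_indexers]
    server = indexer01.example.com:9997

    # inputs.conf on the indexer, so it listens for forwarded data on 9997
    [splunktcp://9997]
    disabled = 0

With just these two stanzas the forwarding role and the indexing role live on separate machines, which is the essence of the distributed deployment described above.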
And as we have discussed, it can be scaled up from a single-instance setup to a distributed setup, where each component is responsible for a specific task, at any point in the Splunk installation or operations phase. In previous videos we went through this: you can add a search head at any time, you can add an indexer at any time, with no impact, no data loss, and no operational disruption, because that is how Splunk has been designed. It is easy to scale both horizontally and vertically. You can add resources to one machine, or add additional components to the architecture, and you still have all the data; you can search, you can report, you can do everything. But it's always a good practice to start from a good foundation.

Moving to the architecture, let me open up the designs I have created. So here, this is the architecture we have created for a small enterprise, with less than 100 GB of data per day. We can see only one indexer, one search head, and many universal forwarders, which are grouped into a single block because it would look ugly if I drew lines from every endpoint to the one indexer. So I made them a container and showed them as a single source. The overall idea is simple: all the logs collected by agents or syslog devices are sent to the indexer.

This is what a typical small Splunk architecture looks like. The data sources can be syslog devices, firewalls, and universal forwarders on Apple, Solaris, Linux, or Windows machines, or even scripted inputs, all sending logs to our indexer. Coming to the data flow: the logs are collected by the universal forwarders and sent to the indexer, where they are parsed, broken down into events, and stored on our storage. The indexer holds 100% of the data. The search head is the one that queries the indexer, based on the searches we write or the alerts and reports we schedule; the query runs against that storage, and the indexer fetches the results and returns them to the search head for visualization or alerting purposes.
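As a rough illustration of that data flow, the sketch below shows how a forwarder might be told what to collect and how a search head might be pointed at the indexer it queries. The monitored path, the index name os_logs, and the host name are hypothetical, and the sketch assumes that index already exists on the indexer.

    # inputs.conf on a universal forwarder: monitor a log file (hypothetical path and index)
    [monitor:///var/log/syslog]
    index = os_logs
    sourcetype = syslog

    # distsearch.conf on the search head: register the indexer as a search peer (hypothetical host)
    [distributedSearch]
    servers = https://indexer01.example.com:8089

The forwarder only collects and ships, the indexer parses and stores, and the search head only searches, which matches the picture above where the indexer holds 100% of the data.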