Hey. Hey. Welcome to the next section. So this section, as you can probably gather, we're still working with our books data. I hope you're not tired of it yet; if you are, we're almost done with it. But it's important that we keep working with one data set for a little bit at least, so that you can get familiar with it. And the reason that matters is that as we do some of the more advanced things, we want you to be able to check your work.

So in this section, we're going to learn a lot about different ways of performing analysis on data: things like finding averages, summing a bunch of data together, or grouping things by author and calculating average quantities or page counts per author. These are operations where, if we had 10,000 books or 10,000 of something else, it's really hard to know if you're doing it right or wrong. You get an answer, just a number, let's say 67.5, and how could you know if that's right or wrong? But if we're working with books and we have 20 of them, and you're familiar with the page counts and the authors, you'll know, okay, this author has three books, that seems right, versus, okay, that author only has two books, so why are we saying they have three? Or something like that. Terrible example, but the same idea holds: we know our data at this point. Along the way, and by the end of this course, we're going to keep upgrading our data to more complex structures: more tables, more rows, more complex stuff. At the very end, the capstone case study will be working with Instagram-esque data, fake data for Instagram, and we'll have thousands and thousands of rows, and you won't actually know if you're doing things right or wrong based off the number you're getting, unless you've manually checked your work by doing 1,000 additions or something. So all that's to say, stick with the books if you can.

And in this section we're focusing a lot on these new aggregate functions. Those are things like finding averages, counting, and summing things together, based off of grouping data. It's a bit hard to explain in a headshot without showing you code, so I'll let the code do that in just a few videos from now. But the rough idea is that we take our data, and there are all sorts of insights we can gain.
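Just to make that concrete before the code videos arrive, here is a minimal sketch of the kind of aggregate query this section builds toward. The table and column names (books, author_lname, pages) are assumptions about how our books data is laid out, not exact code from the course.

```sql
-- Sketch: count each author's books and average their page counts.
-- Assumes a books table with author_lname and pages columns.
SELECT
    author_lname,
    COUNT(*)   AS books_written,
    AVG(pages) AS avg_pages
FROM books
GROUP BY author_lname;
```

With only twenty or so books, you can eyeball that result and confirm each author's book count and average by hand, which is exactly why sticking with a small, familiar data set matters here.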
So rather than just working with an individual row or a group of rows, we can combine things into, like, mega rows. I can take all of our authors and group books based off of who wrote them, or group books based off of what year they were written in, and then perform operations on those groups. That allows me to do things like find the average sales that we've had per year, or find the average page count for books per genre, and then we can expand that, obviously, to more complex stuff. If you're working with advertising data, or let's say our Instagram data, we'll be able to, at the end of the course, do things like find out which one of our users is a power user, or an influencer, I guess they're called, meaning that they have the most comments and the most likes on each one of their posts on average. So who in our database is getting, on average, the most likes and comments? Or we could ask which hashtag generates the most traction. To do that, we would need to analyze all of our hashtags, take all of our photos that have those hashtags, group them together, and figure out which one of those hashtags generates the most likes; there's a rough sketch of that kind of query at the end of this intro. So there's a lot of stuff that we can do with these aggregate functions. They form the backbone of a lot of the questions and analysis that we'll do throughout the course.

All right, I'm rambling now, so I'm going to go away. Hopefully you enjoy this section. It's important. I say that a lot, but it really is. And of course, we'll have a bunch of exercises throughout the section, and especially at the end. I'm trying to keep those interesting, so look forward to those. If you don't, just remember: it's a database course. I'm trying my best. It's... it's databases. All right, I'm done.
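As a closing teaser, here is a rough sketch of the hashtag question described above. Every table and column name here (tags, photo_tags, likes) is a hypothetical placeholder for illustration, not the actual Instagram schema built later in the course.

```sql
-- Hypothetical schema: which hashtag's photos collect the most likes?
-- Join tags to their photos, count likes per tag, keep the top one.
SELECT
    tags.tag_name,
    COUNT(likes.photo_id) AS total_likes
FROM tags
JOIN photo_tags ON photo_tags.tag_id = tags.id
JOIN likes      ON likes.photo_id = photo_tags.photo_id
GROUP BY tags.tag_name
ORDER BY total_likes DESC
LIMIT 1;
```

The same grouping idea would also answer the power-user question: group likes and comments by user instead of by hashtag, and average them per post.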