1 00:00:00,300 --> 00:00:00,780 All right. 2 00:00:00,780 --> 00:00:06,420 So in this video, we're going to learn about the over clause, which is critical to working with window 3 00:00:06,420 --> 00:00:07,170 functions. 4 00:00:07,350 --> 00:00:10,710 But before we do that, I want us to get on the same page in terms of data. 5 00:00:10,800 --> 00:00:14,280 So you can follow the videos and type this out yourself. 6 00:00:14,280 --> 00:00:19,500 But I would just look at the code that I've included along with this video and run this. 7 00:00:20,010 --> 00:00:21,810 So we make a new table employees. 8 00:00:21,810 --> 00:00:22,620 It's simple. 9 00:00:22,650 --> 00:00:30,330 Employee number department is var char salary is an integer and then we insert just a handful. 10 00:00:30,330 --> 00:00:30,590 Right. 11 00:00:30,600 --> 00:00:35,460 So some engineering salaries, some sales salaries and some customer service salaries. 12 00:00:35,640 --> 00:00:37,080 So I'll go ahead and do this. 13 00:00:37,080 --> 00:00:40,860 Just I'll make a new database, create data base. 14 00:00:40,860 --> 00:00:41,670 What should we call this? 15 00:00:41,670 --> 00:00:43,020 Just I don't know. 16 00:00:43,020 --> 00:00:44,760 We'll call it salaries. 17 00:00:45,660 --> 00:00:49,380 Or no, we'll call it Window Funk's, whatever you want to call it. 18 00:00:49,380 --> 00:00:50,970 You don't even have to make a new database. 19 00:00:51,060 --> 00:00:56,940 I just already had an employees table in my previous database, so I'm going to use this new one window 20 00:00:56,950 --> 00:00:57,520 funks. 21 00:00:57,810 --> 00:00:58,290 All right? 22 00:00:58,290 --> 00:01:02,610 And then I'm going to paste that code to create the table and insert into employees. 23 00:01:02,610 --> 00:01:05,310 I'll do a select star from employees. 24 00:01:05,310 --> 00:01:06,270 And there we are. 25 00:01:06,300 --> 00:01:10,860 We see employee number, department number or department name and salary. 26 00:01:11,400 --> 00:01:19,080 OC So as we saw many times in this course, we could do something like select department comma average 27 00:01:19,080 --> 00:01:25,620 salary from and then employees group by department. 28 00:01:27,280 --> 00:01:29,550 And we discussed this in the previous video. 29 00:01:29,560 --> 00:01:30,130 This works. 30 00:01:30,130 --> 00:01:30,930 This is great. 31 00:01:30,940 --> 00:01:37,960 This is the easiest way to see a single boiled down number for each department's average salary. 32 00:01:37,960 --> 00:01:43,540 So we get three rows, one row that was collapsed down from each of three groups because we only have 33 00:01:43,540 --> 00:01:44,950 three different departments. 34 00:01:44,950 --> 00:01:47,170 So we had different three different groups. 35 00:01:47,170 --> 00:01:53,590 And those groups had know, I don't know, five, six, seven different entries in each group, different 36 00:01:53,590 --> 00:01:57,130 rows all collapse down to a single number for each group. 37 00:01:57,340 --> 00:02:04,930 Now we could even boil it down further and do this select and then just average salary from employees. 38 00:02:04,930 --> 00:02:14,470 And that boils down all of the rows into a single average $68,428, five point or 0.5 $0.07. 39 00:02:14,470 --> 00:02:16,720 I guess we could round it, but we didn't. 40 00:02:16,720 --> 00:02:19,690 That is the average of all of the salaries. 41 00:02:19,690 --> 00:02:23,950 So nothing to do with window functions, just an aggregate function average. 42 00:02:24,340 --> 00:02:32,860 But if I alter my syntax a tiny bit and I include the over clause with parentheses, after an aggregate 43 00:02:32,860 --> 00:02:36,220 function like average, we will get a different result. 44 00:02:36,760 --> 00:02:43,300 So if I try that select average and then over from employees, huh. 45 00:02:43,360 --> 00:02:49,960 Now what we get is the same number 68,428 six 8428. 46 00:02:49,960 --> 00:02:56,320 But we get one value for every single row in our data set or in our selection, which in our case we 47 00:02:56,320 --> 00:02:58,060 selected everything from employees. 48 00:02:58,570 --> 00:03:01,870 So remember how these window functions work. 49 00:03:01,870 --> 00:03:07,630 They take our data set, form it into one or more windows, which we'll talk about. 50 00:03:07,630 --> 00:03:08,980 Think of those as groups. 51 00:03:09,220 --> 00:03:14,620 Run some code with those windows, get a value back, gives us that value with each row. 52 00:03:14,620 --> 00:03:17,590 It doesn't collapse those groups or those windows. 53 00:03:17,830 --> 00:03:25,600 So what we did here when we said select average salary over when you do over with empty parentheses, 54 00:03:25,600 --> 00:03:27,970 that sets up or constructs a window. 55 00:03:27,970 --> 00:03:33,880 But we left it empty, which means a window that includes all the rows, all the records. 56 00:03:33,880 --> 00:03:38,650 Shortly we'll learn how to partition, how to create different sub or smaller windows. 57 00:03:38,650 --> 00:03:46,390 But for now we just said find the average salary for all of our rows together and give us the value 58 00:03:46,390 --> 00:03:48,100 alongside each row. 59 00:03:48,190 --> 00:03:49,900 Now, we didn't select anything else. 60 00:03:49,900 --> 00:03:51,610 We only got the average salary. 61 00:03:51,610 --> 00:03:59,710 But what I could do is this select employee number, salary, department, even. 62 00:03:59,710 --> 00:04:01,060 Let's do department first. 63 00:04:03,060 --> 00:04:12,810 And then average salary over empty parentheses from employees and employees. 64 00:04:13,380 --> 00:04:16,950 And what do we get every row that we selected, which is all rows? 65 00:04:16,950 --> 00:04:18,029 We don't have a wear in there. 66 00:04:18,029 --> 00:04:19,800 We didn't narrow it down in any way. 67 00:04:20,040 --> 00:04:26,490 We have the employee number, the department, the salary, and then also the average for all of our 68 00:04:26,490 --> 00:04:29,520 salaries, which maybe it's kind of weird to do that. 69 00:04:29,520 --> 00:04:30,060 Maybe not. 70 00:04:30,060 --> 00:04:35,550 It could be useful to be able to see individual salary alongside average salary. 71 00:04:35,820 --> 00:04:40,710 But I'm going to show you how we can get more detailed statistics, how we can break this down by department, 72 00:04:40,710 --> 00:04:45,120 for example, and see each department salary alongside each individual salary. 73 00:04:45,120 --> 00:04:46,320 But this is our first step. 74 00:04:46,320 --> 00:04:48,930 By using over, we created a window. 75 00:04:48,930 --> 00:04:53,460 Think of that as a group, a big group, and we didn't specify anything inside the parentheses. 76 00:04:53,460 --> 00:04:54,720 We don't know anything yet. 77 00:04:54,930 --> 00:04:56,850 And so that tells my SQL. 78 00:04:56,850 --> 00:05:02,730 This window includes all the rows, find the average of salary across all the rows, and because we 79 00:05:02,730 --> 00:05:04,740 used over, it doesn't reduce us down. 80 00:05:04,740 --> 00:05:07,590 It doesn't take that window and turn it into a single row. 81 00:05:07,620 --> 00:05:12,120 It gives us that value for every single row that was in the window. 82 00:05:12,630 --> 00:05:17,280 So we also can do this with all the other aggregate functions we've seen so far. 83 00:05:18,000 --> 00:05:20,850 Let me just put this in a file so we have a record of it. 84 00:05:21,360 --> 00:05:25,890 So something like we've seen some, right? 85 00:05:25,890 --> 00:05:28,350 We've seen Men and Macs. 86 00:05:28,980 --> 00:05:30,900 Just show that real quick. 87 00:05:31,170 --> 00:05:37,620 If I want to see the minimum salary from the entire table, this is what I would do. 88 00:05:37,710 --> 00:05:40,020 And then I'll also do the max. 89 00:05:41,140 --> 00:05:44,260 Max salary over. 90 00:05:45,290 --> 00:05:51,740 And to that over makes a big difference because now we see men and Macs alongside each row. 91 00:05:51,740 --> 00:05:55,520 So I can look at, all right, this person made $80,000 as a salary. 92 00:05:55,520 --> 00:05:57,830 The minimum salary in the database is 30. 93 00:05:58,010 --> 00:05:59,570 In the table, 31,000. 94 00:05:59,570 --> 00:06:01,730 Maximum is 159,000. 95 00:06:02,030 --> 00:06:04,310 And we see that for every single row. 96 00:06:04,370 --> 00:06:09,830 And I understand it doesn't seem that useful to have the Min and Max column for every single row just 97 00:06:09,830 --> 00:06:11,030 always be the same. 98 00:06:11,300 --> 00:06:13,130 But it is still different. 99 00:06:13,130 --> 00:06:15,380 And it is it is somewhat useful. 100 00:06:15,590 --> 00:06:20,000 If we wanted to be able to compare somebody's salary to the average, the min and the max, we don't 101 00:06:20,000 --> 00:06:21,050 have to do two queries. 102 00:06:21,050 --> 00:06:27,170 We can look at every salary and see aggregate statistics alongside, but we'll have more control in 103 00:06:27,170 --> 00:06:28,280 the coming videos. 104 00:06:28,580 --> 00:06:34,910 So to sum this up, what we saw, if I get rid of over, let me just show what happens then I'll keep 105 00:06:34,910 --> 00:06:35,810 this version. 106 00:06:37,550 --> 00:06:39,920 And I'll get rid of over there. 107 00:06:40,770 --> 00:06:43,150 And just do min salary and max salary. 108 00:06:43,170 --> 00:06:47,250 We know how to do that, Min and Max, but we're going to have a bad time. 109 00:06:49,940 --> 00:06:56,660 Because we're trying to do a group by expression, but we're also referencing columns that are not part 110 00:06:56,660 --> 00:07:00,470 of the group by they're not grouped together values. 111 00:07:00,470 --> 00:07:02,720 So we've learned about this earlier, right? 112 00:07:02,720 --> 00:07:09,290 If I do a group by even though I didn't explicitly write the clause group by it still is grouping behind 113 00:07:09,290 --> 00:07:09,680 the scenes. 114 00:07:09,680 --> 00:07:15,530 When we do Min and Max, it's trying to do some aggregate stuff and boil it down to a single value like 115 00:07:15,530 --> 00:07:16,640 if we did this. 116 00:07:19,230 --> 00:07:21,890 31,150 9000. 117 00:07:21,900 --> 00:07:23,460 That is how men and Macs work. 118 00:07:23,460 --> 00:07:27,680 When we don't call them as window functions, we get a single value. 119 00:07:27,690 --> 00:07:34,040 So we were trying to get a single value min and max alongside all the rows of salary department, all 120 00:07:34,050 --> 00:07:34,950 those values. 121 00:07:34,950 --> 00:07:37,590 And the way to do that is by using over. 122 00:07:37,800 --> 00:07:40,710 So this is going to give us the same number. 123 00:07:40,710 --> 00:07:43,740 At the end of the day, 31,150 9000. 124 00:07:43,740 --> 00:07:48,480 But the output is very different and soon we'll see that the way that it's actually performed is very 125 00:07:48,480 --> 00:07:49,290 different as well. 126 00:07:49,590 --> 00:07:53,040 So Min and Max Sum, all the different aggregate functions work. 127 00:07:53,040 --> 00:07:55,830 I don't need to I don't think I need to show them all to you. 128 00:07:55,830 --> 00:08:01,020 And there's also a handful of window only functions that we're going to learn about in this section 129 00:08:01,020 --> 00:08:01,650 as well. 130 00:08:01,650 --> 00:08:04,620 But so far we've seen things like average men and Macs. 131 00:08:04,620 --> 00:08:07,080 Those are regular aggregate functions. 132 00:08:07,080 --> 00:08:08,730 You don't have to use over with them. 133 00:08:08,730 --> 00:08:11,790 But if you do, we get this very different result. 134 00:08:12,510 --> 00:08:15,720 So next up, we're going to learn about what we can put inside of over.