Hey. Hey, welcome to the next section. So this section, as you can probably gather, we're still working with our books data. I hope you're not tired of it yet; if you are, we're almost done. But it's important that we keep working with one data set for a little bit at least, so that you can get familiar with it. And the reason that matters is that as we do some of the more advanced things, we want you to be able to check your work.

So in this section we're going to learn a lot about different ways of performing analysis on data: things like finding averages, summing a bunch of data together, or grouping things by author and calculating average quantities or page counts per author. These are operations where, if we had 10,000 books or 10,000 of something else, it's really hard to know if you're doing it right or wrong. You get an answer, just a number, let's say 67.5, and how could you know if that's right or wrong? But if we're working with books and we have 20 of them, and you're familiar with the page counts and the authors, you'll know, okay, this author has three books, that seems right. Versus, okay, why are we saying that author only has two books when we know they have three? Something like that. Terrible example, but the same idea holds. We know our data at this point.

Along the way, and toward the end of this course, we're going to keep upgrading our data to more complex structures: more tables, more rows, more complex stuff. At the very end, the capstone case study will be working with Instagram-esque data, fake data for Instagram, and we'll have thousands and thousands of rows. You won't actually know if you're doing things right or wrong based off the number you're getting, unless you've manually checked your work by doing a thousand additions or something. So all that's to say, stick with the books if you can.

In this section we're focusing a lot on these new aggregate functions. Those are things like finding averages, counting, and summing things together based off of grouping data. It's a bit hard to explain in a headshot without showing you code, so I'll let the code do that in just a few videos from now. But the rough idea is that we take our data and there are all sorts of insights we can gain. Rather than just working with an individual row or a group of rows, we can combine things into, like, mega rows. There's a little sketch of what I mean just below, if you want a preview.
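Just to make that a bit more concrete, here's a minimal sketch of the kind of query this section builds toward. The table and column names here (books, author_lname, pages) are placeholders I'm assuming for illustration; we'll write the real versions together in a few videos.

```sql
-- A rough sketch: group every book by its author,
-- then compute per-author stats on each group.
SELECT
    author_lname,
    COUNT(*)   AS total_books,  -- how many books each author wrote
    AVG(pages) AS avg_pages     -- average page count per author
FROM books
GROUP BY author_lname;
```

Each row of the result is one of those "mega rows": one line per author, with the count and average calculated across all of that author's books.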
I can combine all of our authors and group books based off of who wrote them, or group books based off of what year they were written in, and then perform operations on those groups. So it allows me to do things like find the average sales we've had per year, or find the average page count for books per genre, and then we could expand that, obviously, to more complex stuff. If you're working with advertising data, or let's say our Instagram data, we'll be able, at the end of the course, to do things like find out which one of our users is a power user, or influencer I guess they're called, meaning the one who gets the most comments and the most likes on each of their posts on average. So who in our database is getting, on average, the most likes and comments? Or we could ask which hashtag generates the most traction. To do that, we would need to take all of our hashtags, gather all of the photos that have those hashtags, group them together, and figure out which hashtag generates the most likes; there's a rough sketch of that kind of query at the end of this intro, if you're curious.

So there's a lot of stuff that we can do with these aggregate functions. They form the backbone of a lot of the questions and analysis that we'll do throughout the course. All right, I'm rambling now. I'm going to go away. Hopefully you enjoy this section. It's important. I say that a lot, but it really is. And of course, we'll have a bunch of exercises throughout the section, and especially at the end, and I'm trying to keep those interesting. Look forward to those. If you don't, just remember: it's a database course. I'm trying my best. It's... it's databases. All right, I'm done.
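For the curious, here's the rough sketch of that hashtag question I mentioned above. The table and column names (tags, photo_tags, likes) are invented purely for illustration; the real Instagram-style schema comes much later in the course, so treat this as a preview of the shape of the query, not the final answer.

```sql
-- A hypothetical sketch of "which hashtag gets the most likes?"
-- (table and column names are made up for illustration only).
SELECT
    tags.tag_name,
    COUNT(likes.photo_id) AS total_likes  -- likes across every photo carrying this tag
FROM tags
JOIN photo_tags ON photo_tags.tag_id = tags.id
JOIN likes      ON likes.photo_id = photo_tags.photo_id
GROUP BY tags.tag_name
ORDER BY total_likes DESC;  -- most "traction" first
```

The pattern is the same as with the books: group the rows that belong together (here, every like on every photo sharing a tag), then let an aggregate function like COUNT summarize each group.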