1 00:00:00,380 --> 00:00:02,690 In this lesson, we are going to dive 2 00:00:02,690 --> 00:00:05,620 just a little bit deeper off of the last lesson 3 00:00:05,620 --> 00:00:08,980 and we are going to talk about understanding dimensions. 4 00:00:08,980 --> 00:00:11,080 So what we're going to do is we're going to take a look 5 00:00:11,080 --> 00:00:15,210 at fact tables and fact table grains, 6 00:00:15,210 --> 00:00:17,150 and then we're going to dive slightly deeper 7 00:00:17,150 --> 00:00:19,150 into database dimensions 8 00:00:19,150 --> 00:00:21,490 and talk about that a little bit more. 9 00:00:21,490 --> 00:00:23,170 Then we'll finish up by talking about 10 00:00:23,170 --> 00:00:26,490 relationships between fact and dimension tables. 11 00:00:26,490 --> 00:00:28,383 So with that, let's dive on in. 12 00:00:29,240 --> 00:00:31,361 So let's start off by talking about fact tables 13 00:00:31,361 --> 00:00:34,490 and fact table grains. 14 00:00:34,490 --> 00:00:36,590 And for this analogy, I'm going to take us into a 15 00:00:36,590 --> 00:00:40,066 grocery store and we're going to talk about everything 16 00:00:40,066 --> 00:00:43,440 in the terms of what we would find in a grocery store. 17 00:00:43,440 --> 00:00:45,670 So let's start off with our fact table. 18 00:00:45,670 --> 00:00:49,750 Now, if you remember fact tables contain numeric data. 19 00:00:49,750 --> 00:00:53,710 So we could be looking at a store profits 20 00:00:53,710 --> 00:00:57,260 or we could look at the product sales that are going across 21 00:00:57,260 --> 00:00:59,820 as they beep through the register. 22 00:00:59,820 --> 00:01:02,620 And we could look at sales by register. 23 00:01:02,620 --> 00:01:04,240 These are all kinds of things 24 00:01:04,240 --> 00:01:07,053 that could lie in that fact table. 25 00:01:08,030 --> 00:01:12,030 And so each row is going to represent a single event. 26 00:01:12,030 --> 00:01:14,703 For instance, it might represent a product sale. 27 00:01:15,820 --> 00:01:19,780 And so each piece of information helps us to measure. 28 00:01:19,780 --> 00:01:21,940 So for instance, we could be looking at 29 00:01:21,940 --> 00:01:25,400 how successful we are, how many products did we sell? 30 00:01:25,400 --> 00:01:27,620 So those individual product sales help us 31 00:01:27,620 --> 00:01:29,840 to measure how profitable we are 32 00:01:29,840 --> 00:01:33,323 or how many sales we have of a particular product. 33 00:01:34,610 --> 00:01:37,659 And so then that takes us to the fact table grain. 34 00:01:37,659 --> 00:01:42,210 Now the fact table grain helps us to identify exactly 35 00:01:42,210 --> 00:01:45,220 what's inside of the fact table record. 36 00:01:45,220 --> 00:01:48,110 Good way to think about it is like the business description 37 00:01:48,110 --> 00:01:50,710 of what's going to be in that record. 38 00:01:50,710 --> 00:01:52,860 So it communicates the level of detail 39 00:01:52,860 --> 00:01:54,110 that we're going to have. 40 00:01:55,090 --> 00:01:57,260 And so for this case, we could look at it 41 00:01:57,260 --> 00:02:01,600 and we could say that our main fact table is sales. 42 00:02:01,600 --> 00:02:04,710 And so all of the data that we we have in that fact table 43 00:02:04,710 --> 00:02:08,813 is going to describe the sales of our particular store. 44 00:02:11,480 --> 00:02:14,440 So then looking at our dimensions. 45 00:02:14,440 --> 00:02:18,020 Dimensions then are going to add additional description 46 00:02:18,020 --> 00:02:20,400 to those fact tables. 47 00:02:20,400 --> 00:02:22,670 So they're not just going to be numeric data, 48 00:02:22,670 --> 00:02:24,500 it's going to be all kinds of things. 49 00:02:24,500 --> 00:02:28,250 For instance, I could have the day, month, 50 00:02:28,250 --> 00:02:32,960 year of the product sales, or I could have the brand name 51 00:02:32,960 --> 00:02:34,450 of the chips that we're selling, 52 00:02:34,450 --> 00:02:39,450 or the type of chips, so barbecue or whatever, right? 53 00:02:40,097 --> 00:02:43,680 So those dimension tables are then going to describe 54 00:02:43,680 --> 00:02:46,590 the sales going on in the fact table. 55 00:02:46,590 --> 00:02:48,580 And they're going to provide us details 56 00:02:48,580 --> 00:02:53,030 about each instance of an object, so our product detail. 57 00:02:53,030 --> 00:02:55,520 And the level of detail that we include 58 00:02:55,520 --> 00:02:59,390 is incredibly important because if we add too much detail 59 00:02:59,390 --> 00:03:01,640 it's going to bog down our queries 60 00:03:01,640 --> 00:03:04,240 and make it very hard to find the data that we need. 61 00:03:04,240 --> 00:03:06,310 If we don't add enough detail 62 00:03:06,310 --> 00:03:08,400 then we're going to have a hard time getting the reports 63 00:03:08,400 --> 00:03:10,403 that we want at a business level. 64 00:03:11,360 --> 00:03:13,070 So now let's jump in 65 00:03:13,070 --> 00:03:16,070 and let's talk a little bit about the relationship 66 00:03:16,070 --> 00:03:19,520 between those fact and dimension tables. 67 00:03:19,520 --> 00:03:23,540 So a fact and a dimension table have a relationship 68 00:03:23,540 --> 00:03:28,540 and the relationship is defined by a primary key. 69 00:03:28,670 --> 00:03:31,190 So this is a unique data column that helps us 70 00:03:31,190 --> 00:03:33,070 to define relationships. 71 00:03:33,070 --> 00:03:37,880 So basically let's say that we have our sales example again 72 00:03:37,880 --> 00:03:41,710 and we have all of the sales in our fact table 73 00:03:41,710 --> 00:03:46,260 and then we have a description describing the products 74 00:03:46,260 --> 00:03:48,270 in a dimension table. 75 00:03:48,270 --> 00:03:51,380 Well, we could have an item key for instance 76 00:03:51,380 --> 00:03:53,800 that could tie together the fact table 77 00:03:53,800 --> 00:03:55,570 and the dimension table. 78 00:03:55,570 --> 00:03:58,410 That key builds that relationship. 79 00:03:58,410 --> 00:04:02,640 Or another example, let's say that we have a student ID. 80 00:04:02,640 --> 00:04:05,480 So we're looking at grades that a student gets. 81 00:04:05,480 --> 00:04:06,390 So we have our grades 82 00:04:06,390 --> 00:04:08,550 and we'll have multiple grades of columns 83 00:04:08,550 --> 00:04:12,010 and then we'll have an average there as a final column. 84 00:04:12,010 --> 00:04:15,520 We could use a student ID, the first column there 85 00:04:15,520 --> 00:04:18,520 to be a relationship builder 86 00:04:18,520 --> 00:04:22,760 to help us to tie in to additional tables. 87 00:04:22,760 --> 00:04:25,670 Now, beyond my primary key, I can also have 88 00:04:25,670 --> 00:04:28,750 a foreign key that can provide an additional link 89 00:04:28,750 --> 00:04:31,040 between data in 2 tables. 90 00:04:31,040 --> 00:04:31,873 So for instance, 91 00:04:31,873 --> 00:04:36,340 I could have a class table that provides a list of classes 92 00:04:36,340 --> 00:04:39,090 and then it provides the grades in those classes 93 00:04:39,090 --> 00:04:41,330 and then has a teacher ID. 94 00:04:41,330 --> 00:04:44,270 Now that teacher ID could be linked to another table 95 00:04:44,270 --> 00:04:47,600 that provides the first and last name of that teacher. 96 00:04:47,600 --> 00:04:51,220 So I could actually link those 2 tables together 97 00:04:51,220 --> 00:04:53,130 by that teacher ID. 98 00:04:53,130 --> 00:04:56,130 So the important thing to remember here is fact 99 00:04:56,130 --> 00:04:58,250 and dimension tables have a relationship 100 00:04:58,250 --> 00:05:02,773 and that relationship is defined by the keys that you use. 101 00:05:04,210 --> 00:05:07,722 So let's tie all of this together here 102 00:05:07,722 --> 00:05:12,350 and talk about what we need to know for the DP-203. 103 00:05:12,350 --> 00:05:16,330 So we talked about fact tables and fact table grains. 104 00:05:16,330 --> 00:05:20,350 Fact table grains can absolutely make or break a design. 105 00:05:20,350 --> 00:05:23,860 You have to make sure that the grain that you're using 106 00:05:23,860 --> 00:05:27,440 matches the business description and you have to make sure 107 00:05:27,440 --> 00:05:30,750 that the data in there is relevant to that grain. 108 00:05:30,750 --> 00:05:33,280 So that's very important to make sure that you don't get 109 00:05:33,280 --> 00:05:37,380 a table that gets too busy or doesn't provide enough clarity 110 00:05:37,380 --> 00:05:40,430 or information to get the data that you need out. 111 00:05:40,430 --> 00:05:42,680 How many dimension levels do you need? 112 00:05:42,680 --> 00:05:45,740 This is another critical point to think about. 113 00:05:45,740 --> 00:05:49,110 You can have a star table with 1 dimension level 114 00:05:49,110 --> 00:05:50,610 or you could have a snowflake table 115 00:05:50,610 --> 00:05:53,000 with 20 dimension levels. 116 00:05:53,000 --> 00:05:55,850 The levels that you add is going to add complexity 117 00:05:55,850 --> 00:05:57,400 to your database. 118 00:05:57,400 --> 00:06:00,750 So make sure that you're defining your dimension levels 119 00:06:00,750 --> 00:06:02,950 and the data in a way that is appropriate 120 00:06:02,950 --> 00:06:04,640 and provides the detail needed 121 00:06:04,640 --> 00:06:06,490 to get the data that you need. 122 00:06:06,490 --> 00:06:08,500 It all comes back to that query 123 00:06:08,500 --> 00:06:11,030 and making sure that you're matching the business design 124 00:06:11,030 --> 00:06:13,743 and what the table or database really needs to be. 125 00:06:14,720 --> 00:06:17,810 Finally, the key is the key. 126 00:06:17,810 --> 00:06:21,640 Make sure that your foreign and primary keys make sense 127 00:06:21,640 --> 00:06:23,780 and that you can tie those together to pull all 128 00:06:23,780 --> 00:06:26,550 the tables together to get the data that you need. 129 00:06:26,550 --> 00:06:28,070 So for the DP-203, 130 00:06:28,070 --> 00:06:30,570 what you really need to understand is the basics. 131 00:06:30,570 --> 00:06:32,440 You need to understand the basics about keys. 132 00:06:32,440 --> 00:06:33,660 You need to understand the basics 133 00:06:33,660 --> 00:06:35,800 about fact and dimension tables 134 00:06:35,800 --> 00:06:38,480 and just a little bit of database design. 135 00:06:38,480 --> 00:06:39,450 If you've got that down, 136 00:06:39,450 --> 00:06:43,020 you should be good to move on into the next lesson. 137 00:06:43,020 --> 00:06:45,080 So with that, we'll stop talking about schemas 138 00:06:45,080 --> 00:06:46,033 and we'll move on.