1 00:00:00,280 --> 00:00:01,360 Hey, what's up Gurus. 2 00:00:01,360 --> 00:00:03,930 In this lesson, we are going to talk about 3 00:00:03,930 --> 00:00:06,008 building external tables. 4 00:00:06,008 --> 00:00:07,309 What do I mean by that? 5 00:00:07,309 --> 00:00:09,680 Well, we're going to take a look at first, 6 00:00:09,680 --> 00:00:11,890 what is an external table? 7 00:00:11,890 --> 00:00:14,170 Then we're going to talk about the syntax that you need 8 00:00:14,170 --> 00:00:16,343 in order to build external tables. 9 00:00:17,320 --> 00:00:19,180 So let's get started by talking about 10 00:00:19,180 --> 00:00:21,990 what an external table actually is. 11 00:00:21,990 --> 00:00:25,550 Well, an external table is just a table that comes 12 00:00:25,550 --> 00:00:29,390 from files that are stored outside of your database. 13 00:00:29,390 --> 00:00:32,060 Pretty simple, external table. 14 00:00:32,060 --> 00:00:35,620 It is not used for frequently access tables. 15 00:00:35,620 --> 00:00:37,430 So you would not use this lesson of 16 00:00:37,430 --> 00:00:41,700 the stuff we're talking about for frequently access tables, 17 00:00:41,700 --> 00:00:43,160 but why do we care? 18 00:00:43,160 --> 00:00:45,720 Well, when we need to access data 19 00:00:45,720 --> 00:00:48,600 without copying an entire data set, 20 00:00:48,600 --> 00:00:51,840 an external table is an excellent way to do it. 21 00:00:51,840 --> 00:00:54,560 Basically, it gives you a way to access data 22 00:00:54,560 --> 00:00:57,870 hosted outside of your database, quickly. 23 00:00:57,870 --> 00:01:00,480 Having said that, now that we have some basics around 24 00:01:00,480 --> 00:01:02,180 what an external table is, 25 00:01:02,180 --> 00:01:04,940 let's take a look at some sample syntax 26 00:01:04,940 --> 00:01:07,530 so that we can start to understand what we actually do. 27 00:01:07,530 --> 00:01:09,210 So there's a couple of ways that we can do this. 28 00:01:09,210 --> 00:01:14,210 We can create and alter external tables in Azure Storage 29 00:01:14,230 --> 00:01:18,290 and we can also use external tables with Synapse. 30 00:01:18,290 --> 00:01:20,670 For our purposes, I am going to jump us in 31 00:01:20,670 --> 00:01:23,880 and we'll talk about Azure Storage as our example. 32 00:01:23,880 --> 00:01:25,500 So what you're looking at here is 33 00:01:25,500 --> 00:01:28,150 a sample script for Azure Storage. 34 00:01:28,150 --> 00:01:29,990 So let's start by kind of breaking down 35 00:01:29,990 --> 00:01:32,683 what's going on with the syntax. 36 00:01:34,570 --> 00:01:36,910 So the very first line here, we have 3 options. 37 00:01:36,910 --> 00:01:41,100 We have .create, .alter, or .create-or-alter. 38 00:01:41,100 --> 00:01:45,730 So .create is going to fail if your table already exists. 39 00:01:45,730 --> 00:01:49,520 So you want to use this if you don't have existing tables. 40 00:01:49,520 --> 00:01:54,510 You would use .alter if your table already exists 41 00:01:54,510 --> 00:01:56,540 and we just want to change what's there. 42 00:01:56,540 --> 00:01:59,570 Obviously, if there's not a table, that's going to fail 43 00:01:59,570 --> 00:02:02,540 or we could use the .create-or-altar if we're not sure 44 00:02:02,540 --> 00:02:04,230 and this is not going to fail either way. 45 00:02:04,230 --> 00:02:07,527 It's just going to move forward and create and 46 00:02:07,527 --> 00:02:09,760 or replace your table. 47 00:02:09,760 --> 00:02:10,940 And so then after that 48 00:02:10,940 --> 00:02:13,370 we're simply going to use the external table 49 00:02:13,370 --> 00:02:18,270 and we are going to include the schema that we want to use. 50 00:02:18,270 --> 00:02:20,310 And I'll show you an example of what that looks like 51 00:02:20,310 --> 00:02:22,653 in a minute, but that's where that goes. 52 00:02:23,980 --> 00:02:27,040 Next up, we are going to specify our kind. 53 00:02:27,040 --> 00:02:29,410 And so this is just the type of external table 54 00:02:29,410 --> 00:02:30,690 that we want to create. 55 00:02:30,690 --> 00:02:35,283 So we could use Storage or SQL or whatever we're looking at. 56 00:02:37,310 --> 00:02:38,760 Next partition by 57 00:02:38,760 --> 00:02:40,410 and we've talked quite a bit about partitions, 58 00:02:40,410 --> 00:02:43,370 but this is just going to be our comma separated list 59 00:02:43,370 --> 00:02:44,470 of columns that we want. 60 00:02:44,470 --> 00:02:46,240 So I'm not going to dive too far into that. 61 00:02:46,240 --> 00:02:47,640 You would do your partition by 62 00:02:47,640 --> 00:02:49,393 and then include your path format. 63 00:02:50,630 --> 00:02:53,560 And the backslash is going to be the default 64 00:02:53,560 --> 00:02:54,940 if you don't do anything here. 65 00:02:54,940 --> 00:02:56,750 So if you don't include that line at all, 66 00:02:56,750 --> 00:02:58,400 it's just going to use that default. 67 00:03:00,720 --> 00:03:02,290 And then finally down here at the bottom, 68 00:03:02,290 --> 00:03:04,790 we need to specify our data format. 69 00:03:04,790 --> 00:03:07,560 So there's a couple of different formats that we can use 70 00:03:07,560 --> 00:03:09,620 and you would specify .avro, 71 00:03:09,620 --> 00:03:13,310 or CSV, or JSON, or parquet, those are the most common ones. 72 00:03:13,310 --> 00:03:15,910 There are more, but you would look and see what you wanted. 73 00:03:15,910 --> 00:03:18,180 That's going to be there under that format. 74 00:03:18,180 --> 00:03:20,270 And then we would just do our connection string 75 00:03:20,270 --> 00:03:23,870 so that we know how to connect to our data. 76 00:03:23,870 --> 00:03:25,540 Alright. So that's the basics. I'm going to jump over now 77 00:03:25,540 --> 00:03:28,960 and I'm going to show you an example. 78 00:03:28,960 --> 00:03:30,610 So what we just talked about is on the top 79 00:03:30,610 --> 00:03:32,780 and then I did an example down here at the bottom, 80 00:03:32,780 --> 00:03:34,850 so you can kind of see what it looks like. 81 00:03:34,850 --> 00:03:37,420 So in the example, at the bottom, we did .create. 82 00:03:37,420 --> 00:03:39,980 So we're choosing to create a table. 83 00:03:39,980 --> 00:03:42,480 If it already exists, this is going to fail. 84 00:03:42,480 --> 00:03:43,820 So we have our .create. 85 00:03:43,820 --> 00:03:44,970 We have our table 86 00:03:44,970 --> 00:03:47,250 and then we're just creating an external table here. 87 00:03:47,250 --> 00:03:49,530 And then we are going to give it our schema, 88 00:03:49,530 --> 00:03:52,253 we're going to use Timestamp and CourseName. 89 00:03:53,140 --> 00:03:56,550 And so the Timestamp is going to be our column 90 00:03:56,550 --> 00:03:58,360 and then the colon. 91 00:03:58,360 --> 00:04:00,650 And then after that, we're going to have the column type. 92 00:04:00,650 --> 00:04:02,910 So this is going to be datetime or string. 93 00:04:02,910 --> 00:04:06,900 And so you would just specify all of the columns 94 00:04:06,900 --> 00:04:10,023 that go into your external table that you want to create. 95 00:04:11,430 --> 00:04:13,970 Kind of table we're creating. Next line, storage, 96 00:04:13,970 --> 00:04:16,110 so we're just going to do our storage here. 97 00:04:16,110 --> 00:04:17,930 And then we're going to partition this thing 98 00:04:17,930 --> 00:04:18,840 and we're going to partition it 99 00:04:18,840 --> 00:04:23,060 by month using datetime at the start of the month. 100 00:04:23,060 --> 00:04:26,890 So we would specify our partition information here 101 00:04:26,890 --> 00:04:29,200 and then below that you can see our path format. So this 102 00:04:29,200 --> 00:04:32,540 is just going to be where we specify the path format. 103 00:04:32,540 --> 00:04:34,860 So it's going to be year/month. 104 00:04:34,860 --> 00:04:36,920 So you can see that there as well. 105 00:04:36,920 --> 00:04:39,760 And then for our data format, I'm going to use parquet 106 00:04:39,760 --> 00:04:42,780 and then below there in red, my files are located here. 107 00:04:42,780 --> 00:04:45,970 That's just going to be that storage connection string 108 00:04:45,970 --> 00:04:47,310 that we would include. 109 00:04:47,310 --> 00:04:49,870 So that is the basics of the syntax 110 00:04:49,870 --> 00:04:52,173 for creating external tables. 111 00:04:53,370 --> 00:04:54,950 Couple of points to remember: 112 00:04:54,950 --> 00:04:59,950 One, these are for tables that are outside of your database. 113 00:05:00,080 --> 00:05:03,780 And then two, we are using a KUSTO schema 114 00:05:03,780 --> 00:05:06,180 in order to create those external tables 115 00:05:06,180 --> 00:05:09,100 in Azure Blob Storage or Azure Data Lake. 116 00:05:09,100 --> 00:05:12,870 Now keep in mind, you can also do this in Azure Synapse. 117 00:05:12,870 --> 00:05:15,530 So don't get thrown off on the DP-203 118 00:05:15,530 --> 00:05:16,920 thinking you can only do this 119 00:05:16,920 --> 00:05:19,380 in Blob Storage or a Data Lake. 120 00:05:19,380 --> 00:05:20,510 If you got that down, 121 00:05:20,510 --> 00:05:23,360 you should be just fine to move on to the next lesson. 122 00:05:23,360 --> 00:05:28,360 Again, external tables - outside of your database, 123 00:05:28,470 --> 00:05:31,290 Blob Storage, Data Lake, and Azure Synapse. 124 00:05:31,290 --> 00:05:33,370 Alright, that's it for this lesson, 125 00:05:33,370 --> 00:05:34,620 I'll see you in the next.