1 00:00:01,140 --> 00:00:03,250 So now that we've introduced the players, 2 00:00:03,250 --> 00:00:05,728 in this lesson, we are going to be talking about 3 00:00:05,728 --> 00:00:09,830 designing a stream processing solution. 4 00:00:09,830 --> 00:00:12,540 So in this lesson, we are going to take a look 5 00:00:12,540 --> 00:00:17,500 and focus on a quick review of Azure Stream Analytics. 6 00:00:17,500 --> 00:00:21,600 Then, we're going to see Azure Stream Analytics in action. 7 00:00:21,600 --> 00:00:22,750 I wanted to introduce it 8 00:00:22,750 --> 00:00:24,980 by actually jumping into the Azure portal 9 00:00:24,980 --> 00:00:28,091 and kind of walking through just a very high-level scenario 10 00:00:28,091 --> 00:00:31,120 and showing you a little bit of how it works. I 11 00:00:31,120 --> 00:00:33,370 think that's going to help you as we move through the 12 00:00:33,370 --> 00:00:34,800 lessons to have a little bit better understanding 13 00:00:34,800 --> 00:00:37,223 as we talk about concepts. 14 00:00:38,730 --> 00:00:43,640 So with that, welcome back to Azure Stream Analytics. 15 00:00:43,640 --> 00:00:46,630 Azure Stream Analytics, if you remember from Microsoft, 16 00:00:46,630 --> 00:00:49,850 is the fully managed real-time analytics service 17 00:00:49,850 --> 00:00:51,330 designed to help you analyze 18 00:00:51,330 --> 00:00:53,850 and process fast moving streams. 19 00:00:53,850 --> 00:00:55,290 And you can read the rest. 20 00:00:55,290 --> 00:00:57,368 Basically, it is the tool 21 00:00:57,368 --> 00:01:00,435 that we're going to focus on the most for the DP-203 22 00:01:00,435 --> 00:01:04,580 to move data in a streaming fashion through Azure, 23 00:01:04,580 --> 00:01:07,650 as well as do some cleanup and transformation on that data 24 00:01:07,650 --> 00:01:10,951 as it passes through Azure Stream Analytics. 
25 00:01:10,951 --> 00:01:15,932 At its core, you can have Event Hubs, IoT Hubs, 26 00:01:15,932 --> 00:01:19,270 Blob storage, you have some sort of input, 27 00:01:19,270 --> 00:01:23,150 then, you're going to do some sort of transformation, 28 00:01:23,150 --> 00:01:24,660 which is our second piece, 29 00:01:24,660 --> 00:01:27,330 and that's going to be using a query language 30 00:01:27,330 --> 00:01:29,570 in Azure Stream Analytics. 31 00:01:29,570 --> 00:01:32,730 So we have our input, pull data through, 32 00:01:32,730 --> 00:01:34,320 do some transformation on it 33 00:01:34,320 --> 00:01:36,650 as it's passing through Stream Analytics, 34 00:01:36,650 --> 00:01:38,550 and then, we have our output. 35 00:01:38,550 --> 00:01:41,150 We are going to send it somewhere to store 36 00:01:41,150 --> 00:01:43,070 and save the results or pass it 37 00:01:43,070 --> 00:01:47,330 into some other process in our cloud journey. 38 00:01:47,330 --> 00:01:48,163 So at its core, 39 00:01:48,163 --> 00:01:51,080 you need to think about that for Azure Stream Analytics. 40 00:01:51,080 --> 00:01:54,433 Everything comes down to input, query, and output. 41 00:01:55,640 --> 00:01:59,750 Now, with that very brief refresher, let's see it in action. 42 00:01:59,750 --> 00:02:01,530 Let's just go ahead and jump into the portal 43 00:02:01,530 --> 00:02:05,820 and we will talk more about Azure Stream Analytics. 44 00:02:05,820 --> 00:02:08,740 So I've opened up a portal 45 00:02:08,740 --> 00:02:12,540 and created a Stream Analytics job, 46 00:02:12,540 --> 00:02:16,280 and we are looking now at that Stream Analytics job. 47 00:02:16,280 --> 00:02:20,010 Hasn't run yet, haven't configured any inputs or outputs, 48 00:02:20,010 --> 00:02:22,040 and there is no query as well, 49 00:02:22,040 --> 00:02:24,670 so it looks pretty barren. 50 00:02:24,670 --> 00:02:25,860 So when we start off, 51 00:02:25,860 --> 00:02:29,020 we're always going to take a look at our inputs. 
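The input, query, and output pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Stream Analytics API: `run_job`, `keep_warm_readings`, and the `device` and `temperature` fields are hypothetical names invented for this example.

```python
from typing import Callable, Iterable, Iterator, List

Event = dict  # one record from the stream; field names here are made up


def run_job(
    source: Iterable[Event],
    query: Callable[[Iterable[Event]], Iterator[Event]],
    sink: List[Event],
) -> None:
    """Pull events from the input, run them through the query, write to the output."""
    for result in query(source):
        sink.append(result)


def keep_warm_readings(events: Iterable[Event]) -> Iterator[Event]:
    """Hypothetical transformation: keep readings above 20 and drop extra fields."""
    for event in events:
        if event["temperature"] > 20:
            yield {"device": event["device"], "temperature": event["temperature"]}


# Input: a small in-memory list standing in for Event Hubs, IoT Hub, or Blob storage.
readings = [
    {"device": "a", "temperature": 18, "noise": 1},
    {"device": "b", "temperature": 25, "noise": 2},
    {"device": "a", "temperature": 22, "noise": 3},
]

# Output: a list standing in for whatever store receives the results.
output: List[Event] = []
run_job(readings, keep_warm_readings, output)
```

In the real service the query is written in the Stream Analytics query language and the input and output are configured on the job. The sketch only shows how the three pieces relate.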
52 00:02:29,020 --> 00:02:33,420 And so when you think about creating a Stream Analytics job, 53 00:02:33,420 --> 00:02:34,960 a good way to consider that is honestly 54 00:02:34,960 --> 00:02:36,490 just to get like a whiteboard, 55 00:02:36,490 --> 00:02:38,330 kind of mock up where all the inputs 56 00:02:38,330 --> 00:02:41,440 are going to be coming in from the various IoT sensors 57 00:02:41,440 --> 00:02:43,490 or whatever it is that you have. 58 00:02:43,490 --> 00:02:46,410 Think about what you need to do to get that data 59 00:02:46,410 --> 00:02:48,900 into a format where it's usable, 60 00:02:48,900 --> 00:02:53,100 either for a report or for another step in your process. 61 00:02:53,100 --> 00:02:55,600 And then, just simply define your outputs. 62 00:02:55,600 --> 00:02:58,100 Once you've done that, on your whiteboard, 63 00:02:58,100 --> 00:02:59,740 you're ready to start writing the queries, 64 00:02:59,740 --> 00:03:01,690 ready to start building the inputs and outputs, 65 00:03:01,690 --> 00:03:02,853 so that you can test. 66 00:03:03,690 --> 00:03:05,860 So we're going to start by building our input. 67 00:03:05,860 --> 00:03:09,270 We're going to click on Input, add a stream input. 68 00:03:09,270 --> 00:03:11,130 And if you remember we have 3 choices, 69 00:03:11,130 --> 00:03:14,270 Event Hub, IoT Hub, and Blob storage. 70 00:03:14,270 --> 00:03:16,780 We're going to choose Blob storage now. 71 00:03:16,780 --> 00:03:20,570 For my input alias, I just need to name this thing. 72 00:03:20,570 --> 00:03:23,123 So let's just go ahead and call it testinput1. 73 00:03:24,440 --> 00:03:26,390 We're going to choose our storage account, 74 00:03:26,390 --> 00:03:28,940 and I'm just picking one that I've already created. 75 00:03:30,440 --> 00:03:32,650 And we can use our existing containers, 76 00:03:32,650 --> 00:03:33,913 and that's just fine. 
77 00:03:35,700 --> 00:03:37,180 Now, for our authentication mode, 78 00:03:37,180 --> 00:03:39,440 if you have trouble with authentication, 79 00:03:39,440 --> 00:03:41,670 you would go through and follow the manual steps 80 00:03:41,670 --> 00:03:43,760 to grant authentication here. 81 00:03:43,760 --> 00:03:45,422 And I'm going to leave the rest of this alone 82 00:03:45,422 --> 00:03:48,300 because I'm just trying to show you how this would work. 83 00:03:48,300 --> 00:03:49,510 Now, when I click on Save, 84 00:03:49,510 --> 00:03:51,050 it's going to fail because I haven't done 85 00:03:51,050 --> 00:03:53,050 the manual authentication steps, 86 00:03:53,050 --> 00:03:55,710 but for what we're doing now, that's really okay. 87 00:03:55,710 --> 00:03:56,760 Yeah, and you can see here, 88 00:03:56,760 --> 00:04:01,052 here is our manual setup request for our Blob storage, 89 00:04:01,052 --> 00:04:05,076 but it's going to go ahead and create that test input, 90 00:04:05,076 --> 00:04:08,480 so that is created now. 91 00:04:08,480 --> 00:04:10,930 And so you would do that for all of your inputs. 92 00:04:10,930 --> 00:04:13,210 Then, you would come in to Query 93 00:04:13,210 --> 00:04:16,710 and you would build your query. 94 00:04:16,710 --> 00:04:17,543 And I'm going to go ahead 95 00:04:17,543 --> 00:04:20,760 and just paste a query in here for now. 96 00:04:20,760 --> 00:04:22,380 And you can see that as I do that, 97 00:04:22,380 --> 00:04:27,380 it's going to update my inputs and my outputs, 98 00:04:27,930 --> 00:04:30,450 so you can see what's going on there. 99 00:04:30,450 --> 00:04:32,193 And so for my input, 100 00:04:34,060 --> 00:04:38,483 I can go ahead and change that to testinput1. 101 00:04:39,620 --> 00:04:41,930 There we go, and let's go ahead 102 00:04:41,930 --> 00:04:43,740 and change this one, and then I'll show you 103 00:04:43,740 --> 00:04:44,970 what is happening here. 
104 00:04:44,970 --> 00:04:49,970 So to start off with, I am selecting everything, 105 00:04:50,120 --> 00:04:53,390 and the INTO is actually where we're sending it. 106 00:04:53,390 --> 00:04:56,238 So what we're doing is we're taking all of our files 107 00:04:56,238 --> 00:05:00,878 that are in this testinput1 input that I created, 108 00:05:00,878 --> 00:05:03,017 and we are just simply going to pass 109 00:05:03,017 --> 00:05:06,683 all of those things through to TestOutput, 110 00:05:06,683 --> 00:05:08,730 which it shows you right here. 111 00:05:08,730 --> 00:05:10,060 And we haven't defined that yet, 112 00:05:10,060 --> 00:05:12,240 so we would need to create our output as well. 113 00:05:12,240 --> 00:05:14,270 But that is our very first query. 114 00:05:14,270 --> 00:05:19,270 Select everything, pass it from testinput1 into TestOutput. 115 00:05:20,155 --> 00:05:25,003 And let's go ahead and get rid of those capitals. 116 00:05:26,600 --> 00:05:31,180 There we go. And then, we're going to do another query 117 00:05:31,180 --> 00:05:34,700 and we're going to have a select statement here. 118 00:05:34,700 --> 00:05:37,231 And so what we're going to do is just do a simple count, 119 00:05:37,231 --> 00:05:40,370 and we're going to take our simple count again, 120 00:05:40,370 --> 00:05:45,370 and we are going to pass that into testoutput2. 121 00:05:47,390 --> 00:05:49,723 So we're going to take this testinput1, 122 00:05:52,110 --> 00:05:54,280 here, just like we did at the top, 123 00:05:54,280 --> 00:05:58,490 and we're going to send it to testoutput2, 124 00:05:58,490 --> 00:06:01,033 while doing a simple count. 125 00:06:02,290 --> 00:06:03,950 And then, what we're going to do as well 126 00:06:03,950 --> 00:06:07,740 is we're going to group by a tumbling window. 
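The count grouped by a tumbling window can be sketched in plain Python to show the idea: time is cut into fixed-length windows that do not overlap, and each event is counted in exactly one of them. This is a simplified illustration, not the Stream Analytics query language. The function name `tumbling_window_counts`, the `timestamp` field (in seconds), and the choice to label each window by its start are all assumptions made for this example, and the service's own boundary handling may differ.

```python
from collections import Counter
from typing import Dict, Iterable


def tumbling_window_counts(events: Iterable[dict], window_seconds: int) -> Dict[int, int]:
    """Count events per fixed, non-overlapping window, keyed by window start."""
    counts: Counter = Counter()
    for event in events:
        # Integer division snaps each timestamp to the start of its window,
        # so every event lands in exactly one window.
        window_start = (event["timestamp"] // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical events with timestamps in seconds since the stream started.
events = [{"timestamp": t} for t in (1, 4, 9, 10, 12, 25)]

# With 10-second windows: 1, 4, 9 fall in the window starting at 0,
# 10 and 12 in the window starting at 10, and 25 in the window starting at 20.
counts = tumbling_window_counts(events, window_seconds=10)
```

Changing `window_seconds` changes how the same events are grouped, which is why the window choice is part of the design.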
127 00:06:07,740 --> 00:06:08,810 So in the last lesson, 128 00:06:08,810 --> 00:06:10,760 I talked a little bit about windows 129 00:06:10,760 --> 00:06:13,340 and how it's important to look at your data, 130 00:06:13,340 --> 00:06:16,900 so we're going to be talking a lot more about windows. 131 00:06:16,900 --> 00:06:18,220 So what we're doing here with our window 132 00:06:18,220 --> 00:06:20,260 is we're defining a length of time 133 00:06:20,260 --> 00:06:21,960 and we're going to pass data through 134 00:06:21,960 --> 00:06:24,050 that length of time or that window, 135 00:06:24,050 --> 00:06:26,300 and we're going to use that to figure out 136 00:06:26,300 --> 00:06:29,510 where we should do our aggregations. 137 00:06:29,510 --> 00:06:31,660 Okay. So we'll talk more about that later, 138 00:06:31,660 --> 00:06:33,796 but it's a very important concept to think 139 00:06:33,796 --> 00:06:36,596 about your data as it passes through time, 140 00:06:36,596 --> 00:06:40,020 and figure out what your windows are going to look like 141 00:06:40,020 --> 00:06:41,840 and how you're going to aggregate your data 142 00:06:41,840 --> 00:06:44,250 so that you can run your queries. 143 00:06:44,250 --> 00:06:47,020 All right. So this is a super simple example 144 00:06:47,020 --> 00:06:50,480 of a query in Azure Stream Analytics. 145 00:06:50,480 --> 00:06:52,500 And what I would simply do is save that, 146 00:06:52,500 --> 00:06:54,010 and then I could test my query. 147 00:06:54,010 --> 00:06:55,080 I'm not going to test it now, 148 00:06:55,080 --> 00:06:57,290 because we didn't finish our input. 149 00:06:57,290 --> 00:06:58,480 And then the final thing 150 00:06:58,480 --> 00:07:02,630 is we would come in and add our outputs as well. 151 00:07:02,630 --> 00:07:04,950 So I could choose Cosmos DB, for example, 152 00:07:04,950 --> 00:07:09,053 and I could just say that this is testoutput1. 
153 00:07:09,940 --> 00:07:13,030 And I could have my account here, 154 00:07:13,030 --> 00:07:17,080 and this is a Cosmos account that I created earlier. 155 00:07:17,080 --> 00:07:20,650 So it's just going to go ahead and add our test output. 156 00:07:20,650 --> 00:07:23,340 And so when we look at Azure Stream Analytics, 157 00:07:23,340 --> 00:07:24,740 it's really that simple. 158 00:07:24,740 --> 00:07:26,590 We're going to think about our input, 159 00:07:26,590 --> 00:07:29,110 our query, and then our output. 160 00:07:29,110 --> 00:07:31,760 And so you can see here, we now have our testoutput1, 161 00:07:32,703 --> 00:07:34,080 so I could go ahead 162 00:07:34,080 --> 00:07:38,240 and update my query to match my testoutput1. 163 00:07:38,240 --> 00:07:40,033 So I could say testoutput1. 164 00:07:41,150 --> 00:07:42,290 There we go, got rid of that, 165 00:07:42,290 --> 00:07:45,310 and now, it's going to be sending it to that Cosmos DB. 166 00:07:45,310 --> 00:07:46,530 So with that, I'm going to go ahead 167 00:07:46,530 --> 00:07:48,442 and jump back into our lesson 168 00:07:48,442 --> 00:07:51,970 and run through a couple of quick reviews. 169 00:07:51,970 --> 00:07:54,440 So first, stream processing. 170 00:07:54,440 --> 00:07:55,350 As you can see here, 171 00:07:55,350 --> 00:07:58,600 it's really all about our input, query, and output. 172 00:07:58,600 --> 00:07:59,940 So keep that in mind as you think 173 00:07:59,940 --> 00:08:03,200 about designing for Stream Analytics. 174 00:08:03,200 --> 00:08:05,910 You also need to understand how it works. 175 00:08:05,910 --> 00:08:08,155 And I'm going to have quite a few more demos, 176 00:08:08,155 --> 00:08:09,744 as well as some labs, 177 00:08:09,744 --> 00:08:12,290 and so you need to make sure that you understand 178 00:08:12,290 --> 00:08:14,610 and are comfortable with Azure Stream Analytics, 179 00:08:14,610 --> 00:08:18,143 because it is a very important concept on the DP-203. 
180 00:08:19,910 --> 00:08:23,770 Finally, windowing. Windowing definitely matters. 181 00:08:23,770 --> 00:08:25,550 And in the next lesson, we're going to dive in 182 00:08:25,550 --> 00:08:28,300 and talk a lot about the different types of windows 183 00:08:28,300 --> 00:08:29,680 that you can encounter, 184 00:08:29,680 --> 00:08:31,157 but make sure as you look at design, 185 00:08:31,157 --> 00:08:33,066 that you're considering windowing 186 00:08:33,066 --> 00:08:37,403 and how you want to handle queries of your data. 187 00:08:39,080 --> 00:08:41,140 All right, with that, I'm going to go ahead, end this 188 00:08:41,140 --> 00:08:42,600 lesson, and we'll pick up in the next 189 00:08:42,600 --> 00:08:45,150 and talk a lot more about windows. 190 00:08:45,150 --> 00:08:46,100 I'll see you there.