1 00:00:00,450 --> 00:00:04,660 Hello, and welcome to Managing Data Pipelines. 2 00:00:04,660 --> 00:00:08,170 This lesson is going to be a bit of a kitchen sink lesson, 3 00:00:08,170 --> 00:00:10,750 and by that, I mean we are going to take 4 00:00:10,750 --> 00:00:12,420 some of the remaining concepts 5 00:00:12,420 --> 00:00:15,010 that we haven't talked about with Azure Data Factory 6 00:00:15,010 --> 00:00:17,070 and go through those. 7 00:00:17,070 --> 00:00:19,310 Specifically, we're going to be focusing on 8 00:00:19,310 --> 00:00:21,420 monitoring pipeline and trigger runs, 9 00:00:21,420 --> 00:00:24,460 and then talking about integration runtime, 10 00:00:24,460 --> 00:00:26,660 including an understanding of the basics 11 00:00:26,660 --> 00:00:29,860 of what an integration runtime actually is. 12 00:00:29,860 --> 00:00:31,370 And then finally, we'll wrap up 13 00:00:31,370 --> 00:00:34,150 by talking about how to set notifications 14 00:00:34,150 --> 00:00:37,160 and the basics of what those are, 15 00:00:37,160 --> 00:00:38,400 and we'll just touch on that lightly 16 00:00:38,400 --> 00:00:39,870 because we've talked about it a little bit 17 00:00:39,870 --> 00:00:41,370 in some other lessons. 18 00:00:42,470 --> 00:00:44,650 Now, one thing to keep in mind 19 00:00:44,650 --> 00:00:48,360 for the Azure Data Factory lessons in this section: 20 00:00:48,360 --> 00:00:51,170 almost all of them apply to Data Factory 21 00:00:51,170 --> 00:00:53,110 as well as to Synapse, 22 00:00:53,110 --> 00:00:55,400 and I'll show you the Synapse version as well 23 00:00:55,400 --> 00:00:57,320 and we'll talk about the differences there. 24 00:00:57,320 --> 00:00:59,640 So keep in mind, as we talk about pipelines 25 00:00:59,640 --> 00:01:01,040 in Azure Data Factory, 26 00:01:01,040 --> 00:01:03,350 most of those concepts are going to port over 27 00:01:03,350 --> 00:01:05,840 to Azure Synapse as well. 
28 00:01:05,840 --> 00:01:08,640 All right. So with that introduction, let's get started. 29 00:01:11,620 --> 00:01:12,453 All right. 30 00:01:12,453 --> 00:01:16,130 So we're going to be looking at the Monitor section 31 00:01:16,130 --> 00:01:17,530 in Azure Data Factory, 32 00:01:17,530 --> 00:01:19,830 and there are 4 main components to this. 33 00:01:19,830 --> 00:01:21,750 First is dashboards. 34 00:01:21,750 --> 00:01:24,810 And so we'll talk about a quick overview of Data Factory, 35 00:01:24,810 --> 00:01:26,070 and basically that gives you 36 00:01:26,070 --> 00:01:29,650 your 50,000-foot view of what's going on 37 00:01:29,650 --> 00:01:31,390 so that you can see if there are any alerts, 38 00:01:31,390 --> 00:01:33,530 and we'll see what that looks like. 39 00:01:33,530 --> 00:01:37,450 Next up, you have runs, and runs is going to be 40 00:01:37,450 --> 00:01:40,650 a detailed breakdown by activity, 41 00:01:40,650 --> 00:01:45,370 as well as the failure points within your pipelines. 42 00:01:45,370 --> 00:01:48,160 And we can also break that down and look at trigger runs, 43 00:01:48,160 --> 00:01:51,113 as well as just general pipeline runs. 44 00:01:52,190 --> 00:01:54,120 Next is runtimes and sessions. 45 00:01:54,120 --> 00:01:57,100 This is going to include integration runtimes 46 00:01:57,100 --> 00:02:00,750 and data flow debug, and this is a good time to talk about 47 00:02:00,750 --> 00:02:03,493 the basics of what an integration runtime is. 48 00:02:04,780 --> 00:02:07,490 An integration runtime is just the compute infrastructure 49 00:02:07,490 --> 00:02:09,590 that's used by Azure Data Factory 50 00:02:09,590 --> 00:02:12,000 and Azure Synapse pipelines. 
51 00:02:12,000 --> 00:02:14,170 So whenever you use a data flow, 52 00:02:14,170 --> 00:02:17,230 or you're using Copy data for data movement, 53 00:02:17,230 --> 00:02:21,470 or you are dispatching activities, 54 00:02:21,470 --> 00:02:23,690 for things like Azure Databricks 55 00:02:23,690 --> 00:02:25,670 or HDInsight activities, 56 00:02:25,670 --> 00:02:27,220 that is work that is handled 57 00:02:27,220 --> 00:02:28,970 by an integration runtime. 58 00:02:28,970 --> 00:02:31,310 So when you think about integration runtime, 59 00:02:31,310 --> 00:02:34,030 just think of it as the compute 60 00:02:34,030 --> 00:02:37,823 that is used for Data Factory and Synapse. 61 00:02:39,190 --> 00:02:40,730 Then there's debug mode. 62 00:02:40,730 --> 00:02:44,900 This is just a data preview mode that we can use 63 00:02:44,900 --> 00:02:48,110 in data flows for Azure Data Factory, 64 00:02:48,110 --> 00:02:50,340 and it allows us to do some end-to-end testing 65 00:02:50,340 --> 00:02:52,313 on our data flows. 66 00:02:53,520 --> 00:02:55,640 And then, last but not least, notifications. 67 00:02:55,640 --> 00:02:57,460 So we can set up alerts and metrics, 68 00:02:57,460 --> 00:02:59,160 and we've talked about that in some other lessons 69 00:02:59,160 --> 00:03:01,210 so we're not going to spend a ton of time there, 70 00:03:01,210 --> 00:03:03,790 but that is a component as well, 71 00:03:03,790 --> 00:03:05,090 and I'll show you that. 72 00:03:05,090 --> 00:03:06,840 And let's go ahead and actually do that now. 73 00:03:06,840 --> 00:03:08,340 I'll jump over into the portal 74 00:03:08,340 --> 00:03:11,200 and let's talk through some of these concepts. 
75 00:03:11,200 --> 00:03:13,970 So here we find ourselves in the Monitor section 76 00:03:13,970 --> 00:03:17,130 of Data Factory, and you can see here at the top, 77 00:03:17,130 --> 00:03:20,210 we have our dashboards, which is the default view, 78 00:03:20,210 --> 00:03:21,720 and I have intentionally 79 00:03:22,970 --> 00:03:24,700 broken a pipeline for you 80 00:03:24,700 --> 00:03:26,470 so you can see what happens here. 81 00:03:26,470 --> 00:03:27,650 It's going to give you a list 82 00:03:27,650 --> 00:03:29,410 of all the pipelines that have run. 83 00:03:29,410 --> 00:03:32,980 You can see a pie chart breakdown, or a line chart, 84 00:03:32,980 --> 00:03:35,270 and you can also see activities as well. 85 00:03:35,270 --> 00:03:36,570 So if we have activities, 86 00:03:36,570 --> 00:03:38,730 it's going to show you the failure for those, 87 00:03:38,730 --> 00:03:40,483 as well as for the pipeline. 88 00:03:41,530 --> 00:03:44,750 And so you can see here that we have a pipeline 89 00:03:44,750 --> 00:03:47,570 that's got some issues, and if I go ahead and click on that, 90 00:03:47,570 --> 00:03:51,530 it's going to take me down one section to our runs 91 00:03:51,530 --> 00:03:54,090 and it's going to show me my pipeline runs. 92 00:03:54,090 --> 00:03:57,030 So I can actually see that my pipeline failed, 93 00:03:57,030 --> 00:03:59,260 and if I click on Error, it's actually going to give me 94 00:03:59,260 --> 00:04:01,090 exactly what happened. 95 00:04:01,090 --> 00:04:06,020 So you can see here that I have a missing data table, 96 00:04:06,020 --> 00:04:09,710 and so it's not copying the data like it's supposed to. 97 00:04:09,710 --> 00:04:12,523 And so, on our manual trigger, we had a failure. 98 00:04:13,460 --> 00:04:16,860 So that is how we can see our pipeline runs, 99 00:04:16,860 --> 00:04:18,920 and that's also our dashboards. 
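As a rough illustration of what the dashboard and the pipeline runs view are doing, here is a minimal Python sketch that tallies runs by status and pulls out the error message for each failed run. The run records, the pipeline names, and the field names (pipeline_name, status, message) are made up for this example; they are not output from Data Factory, and this is not how the portal is implemented.

```python
from collections import Counter

# Hypothetical run records; names and fields are assumptions for illustration.
SAMPLE_RUNS = [
    {"pipeline_name": "CopySalesData", "status": "Succeeded", "message": ""},
    {"pipeline_name": "CopySalesData", "status": "Failed",
     "message": "Source table not found"},
    {"pipeline_name": "LoadWarehouse", "status": "Succeeded", "message": ""},
]


def summarize_runs(runs):
    """Count runs per status, similar in spirit to the dashboard's pie chart."""
    return Counter(run["status"] for run in runs)


def failed_runs(runs):
    """Return (pipeline name, error message) for every failed run."""
    return [
        (run["pipeline_name"], run["message"])
        for run in runs
        if run["status"] == "Failed"
    ]


if __name__ == "__main__":
    print(dict(summarize_runs(SAMPLE_RUNS)))
    for name, message in failed_runs(SAMPLE_RUNS):
        print(f"{name} failed: {message}")
```

The split into a summary function and a failure list mirrors the two steps in the demo: first the overview, then clicking through to the error for the run that failed.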
100 00:04:18,920 --> 00:04:20,180 And then, while we're in runs, 101 00:04:20,180 --> 00:04:22,470 we can go ahead and click on Trigger here. 102 00:04:22,470 --> 00:04:25,440 And if I had scheduled triggers, I could also see those. 103 00:04:25,440 --> 00:04:28,780 So I can see my wall-clock scheduled triggers. 104 00:04:28,780 --> 00:04:30,300 We can see my tumbling window triggers. 105 00:04:30,300 --> 00:04:31,890 We can see my storage events, 106 00:04:31,890 --> 00:04:33,160 and then if I had any custom events, 107 00:04:33,160 --> 00:04:34,710 we could see those as well. 108 00:04:34,710 --> 00:04:36,990 So you can break down all of your trigger runs, 109 00:04:36,990 --> 00:04:38,650 and it's going to look just like the pipeline runs. 110 00:04:38,650 --> 00:04:41,560 It's going to show you what succeeded and what failed. 111 00:04:41,560 --> 00:04:43,480 And of course, you could always click on it. 112 00:04:43,480 --> 00:04:44,710 You can try and rerun it. 113 00:04:44,710 --> 00:04:47,803 You can see more information if you need to. 114 00:04:49,780 --> 00:04:52,800 Next up, we have our integration runtimes. 115 00:04:52,800 --> 00:04:56,200 And so you can see here that it has spun up and is running 116 00:04:56,200 --> 00:04:59,330 an AutoResolve integration runtime. 117 00:04:59,330 --> 00:05:01,640 Now, when we use an integration runtime, 118 00:05:01,640 --> 00:05:03,900 there are 3 different types that we can use. 119 00:05:03,900 --> 00:05:08,850 There's Azure, self-hosted, and something called Azure-SSIS. 120 00:05:08,850 --> 00:05:12,800 So, most of the time it is actually going to be just standard Azure, 121 00:05:12,800 --> 00:05:17,300 and this is going to be something that is managed by Azure. 122 00:05:17,300 --> 00:05:19,580 In most cases, it's going to spin up and spin down on its own 123 00:05:19,580 --> 00:05:21,250 so we don't have to mess with that. 
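To keep the three integration runtime types straight, here is a small Python lookup sketch. The three type names and the workloads (data flow, data movement, activity dispatch) come from this lesson; which type supports which workload, and the SSIS package execution entry, reflect my understanding of the Microsoft documentation rather than anything shown in the demo, so treat the mapping as an assumption to verify. The enum and function names are invented for this example.

```python
from enum import Enum


class IntegrationRuntimeType(Enum):
    AZURE = "Azure"
    SELF_HOSTED = "Self-hosted"
    AZURE_SSIS = "Azure-SSIS"


# Assumed capability mapping (check against the official documentation).
CAPABILITIES = {
    IntegrationRuntimeType.AZURE: {
        "data flow", "data movement", "activity dispatch",
    },
    IntegrationRuntimeType.SELF_HOSTED: {
        "data movement", "activity dispatch",
    },
    IntegrationRuntimeType.AZURE_SSIS: {
        "ssis package execution",
    },
}


def runtimes_for(workload):
    """Return the runtime types assumed to support the given workload."""
    return [
        runtime
        for runtime, workloads in CAPABILITIES.items()
        if workload in workloads
    ]


if __name__ == "__main__":
    for workload in ("data flow", "data movement", "ssis package execution"):
        names = [runtime.value for runtime in runtimes_for(workload)]
        print(f"{workload}: {', '.join(names)}")
```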
124 00:05:21,250 --> 00:05:23,810 And you can see by clicking on Azure here, 125 00:05:23,810 --> 00:05:25,202 that's where that lives. 126 00:05:25,202 --> 00:05:26,468 Obviously for this course, 127 00:05:26,468 --> 00:05:28,126 we're going to be using just Azure. 128 00:05:28,126 --> 00:05:32,500 So we aren't going to have a need for SSIS or self-hosted. 129 00:05:32,500 --> 00:05:35,070 That would most likely be covered in another exam. 130 00:05:35,070 --> 00:05:38,300 So for the DP-203, I'd really just focus on the Azure type 131 00:05:38,300 --> 00:05:42,233 and on understanding what an integration runtime actually is. 132 00:05:43,640 --> 00:05:45,860 I talked about Data Flow Debug as well. 133 00:05:45,860 --> 00:05:47,900 Finally, we have Alerts and Metrics. 134 00:05:47,900 --> 00:05:49,900 So you can see here, we don't have any alerts. 135 00:05:49,900 --> 00:05:52,730 If I wanted to create one, I could just click on that 136 00:05:52,730 --> 00:05:54,250 and go through alert rules, 137 00:05:54,250 --> 00:05:56,440 and we actually cover that in another lesson. 138 00:05:56,440 --> 00:06:01,170 So I'm not going to focus on alert rules in this lesson, 139 00:06:02,090 --> 00:06:05,190 but that's where you would find your alerts and metrics. 140 00:06:05,190 --> 00:06:06,500 And then, last but not least, 141 00:06:06,500 --> 00:06:09,020 before we jump back in and wrap everything up, 142 00:06:09,020 --> 00:06:12,860 I have also opened up an Azure Synapse Analytics instance, 143 00:06:12,860 --> 00:06:16,230 and I pulled up the exact same Monitor section. 144 00:06:16,230 --> 00:06:18,530 And you can see here, it is very, very similar. 145 00:06:18,530 --> 00:06:21,700 So we have our Integration section, with pipeline and trigger runs. 146 00:06:21,700 --> 00:06:23,283 We have Activities, 147 00:06:23,283 --> 00:06:25,916 and we have SQL Pools or Analytics Pools, 148 00:06:25,916 --> 00:06:29,160 which is going to be something specific to Synapse. 
149 00:06:29,160 --> 00:06:31,050 So it's actually very, very similar, 150 00:06:31,050 --> 00:06:34,370 and you can see even the layout here is very similar 151 00:06:34,370 --> 00:06:36,303 between these 2 instances. 152 00:06:37,630 --> 00:06:40,190 So keep in mind that the concepts 153 00:06:40,190 --> 00:06:42,490 we talk about for Data Factory could also apply 154 00:06:42,490 --> 00:06:45,380 to Synapse on the exam. 155 00:06:45,380 --> 00:06:48,973 All right. With that, let's jump in and do a review. 156 00:06:50,380 --> 00:06:53,310 First off, I want to point out the facts on the right here: 157 00:06:53,310 --> 00:06:57,310 you are at the very end of a very long section. 158 00:06:57,310 --> 00:06:59,390 Congratulations, you have been doing awesome, 159 00:06:59,390 --> 00:07:01,250 and not only that, 160 00:07:01,250 --> 00:07:05,610 this is a good milestone that marks roughly the halfway point 161 00:07:05,610 --> 00:07:06,810 for this course. 162 00:07:06,810 --> 00:07:09,410 So, congratulations there as well. 163 00:07:09,410 --> 00:07:12,490 For the review, you just need to know what's possible. 164 00:07:12,490 --> 00:07:13,690 So we talked about the portal, 165 00:07:13,690 --> 00:07:16,130 so you were able to see the layout in the portal. 166 00:07:16,130 --> 00:07:19,560 Keep that in mind. That's going to be important for you. 167 00:07:19,560 --> 00:07:22,670 Make sure that you have the basics of integration runtime, 168 00:07:22,670 --> 00:07:24,640 again, just the compute infrastructure 169 00:07:24,640 --> 00:07:26,790 for Data Factory and Synapse. 170 00:07:26,790 --> 00:07:28,190 That's probably going to be enough, 171 00:07:28,190 --> 00:07:30,770 but you can also go in and just go a little bit further 172 00:07:30,770 --> 00:07:35,580 by learning about the self-hosted, Azure, and Azure-SSIS types. 173 00:07:35,580 --> 00:07:37,540 If you've got those 2 things down, 174 00:07:37,540 --> 00:07:39,970 the last bit is just notifications. 
175 00:07:39,970 --> 00:07:42,590 If you want an alert email on a failed run, 176 00:07:42,590 --> 00:07:43,523 can you do that? 177 00:07:44,550 --> 00:07:47,060 You should be saying yes, you can do that 178 00:07:47,060 --> 00:07:51,393 by going into that Monitor tab and choosing a new alert. 179 00:07:53,420 --> 00:07:55,600 If you've got that down, you're good to go 180 00:07:55,600 --> 00:07:57,513 and I'll see you in the next lesson.
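To make the failed-run alert idea a little more concrete, here is a minimal Python sketch of the logic behind such a rule: compare a count of failed runs to a threshold and produce an email-style notification when it is exceeded. Everything here (the AlertRule class, the evaluate function, the rule name, and the email address) is hypothetical and only models the concept; it does not call Azure and is not how alerts are actually configured in the Monitor tab.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    """Hypothetical alert rule; not an Azure API object."""
    name: str
    threshold: int  # fire when the failed-run count is greater than this
    email: str


def evaluate(rule: AlertRule, failed_run_count: int) -> Optional[str]:
    """Return a notification string if the rule fires, otherwise None."""
    if failed_run_count > rule.threshold:
        return (
            f"To {rule.email}: alert '{rule.name}' fired, "
            f"{failed_run_count} failed run(s)"
        )
    return None


if __name__ == "__main__":
    rule = AlertRule(name="FailedPipelineRuns", threshold=0,
                     email="dataops@example.com")
    print(evaluate(rule, failed_run_count=1))
```

A threshold of 0 means a single failed run is enough to trigger the notification, which matches the "alert email on a failed run" scenario from the review.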