1 00:00:00,180 --> 00:00:01,170 Hey, what's up, Gurus? 2 00:00:01,170 --> 00:00:05,770 Welcome to Section 4: Data Ingestion and Transformation. 3 00:00:05,770 --> 00:00:07,550 Now, if you'll notice down here at the bottom, 4 00:00:07,550 --> 00:00:10,660 you'll see that Landon is on here, as well as myself, 5 00:00:10,660 --> 00:00:12,920 and we're going to kind of work on this section together. 6 00:00:12,920 --> 00:00:14,600 Landon's going to do most of the lessons. 7 00:00:14,600 --> 00:00:16,600 I'm going to jump in a little bit as well. 8 00:00:16,600 --> 00:00:17,720 So just keep that in mind. 9 00:00:17,720 --> 00:00:18,970 We'll see a little bit of change, 10 00:00:18,970 --> 00:00:22,470 but I wanted to introduce us both for this section. 11 00:00:22,470 --> 00:00:23,620 Now with that, let's get started, 12 00:00:23,620 --> 00:00:25,670 and talk about section 4. 13 00:00:25,670 --> 00:00:29,350 If you'll notice, section 4 is just about 1/3 14 00:00:29,350 --> 00:00:31,130 of the way through this course. 15 00:00:31,130 --> 00:00:33,610 So make sure you look at your schedule, 16 00:00:33,610 --> 00:00:35,550 make sure that you're on track, 17 00:00:35,550 --> 00:00:37,500 and have a little celebration 18 00:00:37,500 --> 00:00:38,730 because when you finish this-- 19 00:00:38,730 --> 00:00:40,590 hey--you're 1/3 of the way there. 20 00:00:40,590 --> 00:00:42,240 That's awesome! 21 00:00:42,240 --> 00:00:44,840 In this section, we are going to be breaking down, 22 00:00:44,840 --> 00:00:46,300 of course, our introduction. 23 00:00:46,300 --> 00:00:48,410 We're going to talk about Azure Data Factory 24 00:00:48,410 --> 00:00:49,350 just a little bit. 25 00:00:49,350 --> 00:00:52,580 We actually have almost an entire section coming up on that, 26 00:00:52,580 --> 00:00:55,050 but we'll kind of introduce that as well. 27 00:00:55,050 --> 00:00:57,420 We're going to talk about Transact-SQL. 
28 00:00:57,420 --> 00:00:59,820 We're going to talk about Azure Synapse Pipelines. 29 00:00:59,820 --> 00:01:02,630 And again, this will be just a little bit of a deeper dive, 30 00:01:02,630 --> 00:01:05,120 and we're going to re-introduce a lot of these services 31 00:01:05,120 --> 00:01:06,230 over and over again, 32 00:01:06,230 --> 00:01:09,750 teaching new concepts to kind of further what you can do 33 00:01:09,750 --> 00:01:12,170 with them as you move through this course. 34 00:01:12,170 --> 00:01:15,140 We'll talk about Scala, Apache Spark, 35 00:01:15,140 --> 00:01:18,363 and we'll talk quite a bit about creating data pipelines. 36 00:01:19,800 --> 00:01:22,210 In addition to that, we're going to talk about designing 37 00:01:22,210 --> 00:01:24,120 and creating tests for data pipelines, 38 00:01:24,120 --> 00:01:26,520 integrating Jupyter and Python notebooks 39 00:01:26,520 --> 00:01:27,830 into data pipelines. 40 00:01:27,830 --> 00:01:31,007 Cleaning data, splitting data, shredding JSON, 41 00:01:31,007 --> 00:01:32,410 and what that means. 42 00:01:32,410 --> 00:01:33,910 And then of course, we're going to talk about 43 00:01:33,910 --> 00:01:36,370 how we encode and decode data, 44 00:01:36,370 --> 00:01:37,500 and we're going to talk about 45 00:01:37,500 --> 00:01:40,363 configuring error handling for transformations. 46 00:01:41,670 --> 00:01:44,010 We will also talk about normalizing 47 00:01:44,010 --> 00:01:45,930 and denormalizing values, 48 00:01:45,930 --> 00:01:49,000 performing exploratory data analysis, 49 00:01:49,000 --> 00:01:51,870 and finally, our section recap. 50 00:01:51,870 --> 00:01:55,470 So, if you think there's a lot of stuff in this section, 51 00:01:55,470 --> 00:01:56,750 you are correct! 52 00:01:56,750 --> 00:01:59,900 There is a ton of information in this section, 53 00:01:59,900 --> 00:02:01,750 a ton of different concepts, 54 00:02:01,750 --> 00:02:05,610 but they all tie into data ingestion and transformation. 
55 00:02:05,610 --> 00:02:08,100 As we talk through those services, keep in mind, 56 00:02:08,100 --> 00:02:11,340 don't get too hung up on the services themselves, 57 00:02:11,340 --> 00:02:13,660 because we'll be going back over those services 58 00:02:13,660 --> 00:02:15,020 throughout the course, 59 00:02:15,020 --> 00:02:17,523 as we reinforce these concepts. 60 00:02:19,050 --> 00:02:20,270 Finally, don't forget, 61 00:02:20,270 --> 00:02:22,550 we do have a hands-on lab in the section: 62 00:02:22,550 --> 00:02:24,960 Moving and Transforming Data with Azure Data Factory. 63 00:02:24,960 --> 00:02:26,570 So, make sure that you tune into that, 64 00:02:26,570 --> 00:02:27,880 because that is going to be something 65 00:02:27,880 --> 00:02:29,830 that you definitely want to understand. 66 00:02:31,280 --> 00:02:33,340 So as we think about this section, 67 00:02:33,340 --> 00:02:35,610 Data Ingestion and Transformation 68 00:02:35,610 --> 00:02:37,770 is all about taking raw materials 69 00:02:37,770 --> 00:02:39,810 and transforming them into usable data 70 00:02:39,810 --> 00:02:41,610 for downstream consumption. 71 00:02:41,610 --> 00:02:43,860 So in this case, we're going to be talking about 72 00:02:43,860 --> 00:02:45,550 business decision-making. 73 00:02:45,550 --> 00:02:47,340 So this data is incredibly important 74 00:02:47,340 --> 00:02:49,870 because it's going to shape the vision for the business 75 00:02:49,870 --> 00:02:51,400 as it moves forward. 76 00:02:51,400 --> 00:02:54,250 And so really, it's all about taking that coal, 77 00:02:54,250 --> 00:02:56,280 compressing it down into something that's usable, 78 00:02:56,280 --> 00:02:59,240 and turning it into something beautiful, like a diamond. 79 00:02:59,240 --> 00:03:00,073 All right, with that, 80 00:03:00,073 --> 00:03:01,890 we're done with this introduction. 
81 00:03:01,890 --> 00:03:03,260 I'm going to turn you over to Landon, 82 00:03:03,260 --> 00:03:05,260 and I will jump back in to see you later 83 00:03:05,260 --> 00:03:06,780 for the section recap. 84 00:03:06,780 --> 00:03:07,693 Enjoy the section.