1 00:00:00,520 --> 00:00:03,310 All right, it is time for the lesson 2 00:00:03,310 --> 00:00:05,930 on role-based access control. 3 00:00:05,930 --> 00:00:07,140 In this lesson, we're going to talk 4 00:00:07,140 --> 00:00:10,470 about how we secure Azure Data Lake through RBAC, 5 00:00:10,470 --> 00:00:12,410 or role-based access control, 6 00:00:12,410 --> 00:00:15,950 and access control lists, or ACLs. 7 00:00:15,950 --> 00:00:19,030 Now, if you have earned any certification, 8 00:00:19,030 --> 00:00:21,530 if you've done the AZ-900, the DP-900, 9 00:00:21,530 --> 00:00:24,760 or any Azure certification, you should have some familiarity 10 00:00:24,760 --> 00:00:28,520 with role-based access control. But we'll do a quick review 11 00:00:28,520 --> 00:00:30,350 so that you understand the basics, 12 00:00:30,350 --> 00:00:31,440 and then we'll jump in 13 00:00:31,440 --> 00:00:34,910 and we'll talk about access control lists, or ACLs, 14 00:00:34,910 --> 00:00:38,090 which you probably have a little bit less familiarity with. 15 00:00:38,090 --> 00:00:41,890 But we'll talk about what you need to know for the DP-203. 16 00:00:41,890 --> 00:00:43,620 And then of course, we'll finish up 17 00:00:43,620 --> 00:00:45,800 by jumping into the portal, and I'll show you 18 00:00:45,800 --> 00:00:50,210 where you would be able to find role-based access control. 19 00:00:50,210 --> 00:00:52,750 So with that, let's talk about RBAC. 20 00:00:53,760 --> 00:00:56,730 It all starts with a security principal. 21 00:00:56,730 --> 00:01:01,610 Now, a security principal is just a representation 22 00:01:01,610 --> 00:01:03,420 of an identity that is requesting access. 23 00:01:03,420 --> 00:01:06,780 And basically, what happens is 24 00:01:06,780 --> 00:01:10,650 any user, group, or service principal 25 00:01:10,650 --> 00:01:14,920 has to go through this security process. 26 00:01:14,920 --> 00:01:17,510 So it all starts with this security principal. 
27 00:01:17,510 --> 00:01:19,460 So this is going to be a guest, 28 00:01:19,460 --> 00:01:24,460 or an employee, or an application, a service principal 29 00:01:24,530 --> 00:01:28,820 that's trying to access another resource within Azure. 30 00:01:28,820 --> 00:01:31,670 Okay? So it starts with that security principal. 31 00:01:31,670 --> 00:01:34,790 Then, we talk about a role definition. 32 00:01:34,790 --> 00:01:37,780 So now that we know who's trying to access, 33 00:01:37,780 --> 00:01:41,370 we need to talk about what it is that we want them to do. 34 00:01:41,370 --> 00:01:44,560 So are they an owner? Are they a contributor? 35 00:01:44,560 --> 00:01:47,900 Can they only read things but not edit anything? 36 00:01:47,900 --> 00:01:50,880 Can they do nothing or next to nothing? 37 00:01:50,880 --> 00:01:52,960 So this is the role definition, 38 00:01:52,960 --> 00:01:55,920 and it defines what the security principal, 39 00:01:55,920 --> 00:01:59,780 or the user in step 1, is able to do. 40 00:01:59,780 --> 00:02:02,070 The third piece is the scope. 41 00:02:02,070 --> 00:02:04,890 So, where can they do this thing? 42 00:02:04,890 --> 00:02:08,470 So, it may only apply to Azure Synapse, 43 00:02:08,470 --> 00:02:11,440 or it may only apply to a SQL Database, 44 00:02:11,440 --> 00:02:14,430 or it might apply to an entire resource group. 45 00:02:14,430 --> 00:02:17,470 So what we do is we take that security principal, 46 00:02:17,470 --> 00:02:20,140 and that role definition, and that scope, 47 00:02:20,140 --> 00:02:22,580 and we marry all 3 of those together 48 00:02:22,580 --> 00:02:25,260 into a role assignment. 
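To make the three pieces concrete, here is a minimal Python sketch of a role assignment as "who + what + where". This is only an illustration, not how Azure implements it: the `RoleAssignment` class, the `applies_to` helper, the subscription ID `sub-demo`, and the account name `lakedemo` are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    principal: str  # who: a user, group, or service principal (hypothetical id)
    role: str       # what: a role definition name such as "Reader"
    scope: str      # where: a resource-ID-style path


def applies_to(assignment: RoleAssignment, resource_id: str) -> bool:
    """Return True if the resource is the scope itself or sits beneath it."""
    scope = assignment.scope.rstrip("/").lower()
    target = resource_id.rstrip("/").lower()
    return target == scope or target.startswith(scope + "/")


# Hypothetical scope: one resource group.
rg_scope = "/subscriptions/sub-demo/resourceGroups/rg-data"
assignment = RoleAssignment("analyst@example.com", "Reader", rg_scope)

# A storage account inside that resource group inherits the assignment.
storage = rg_scope + "/providers/Microsoft.Storage/storageAccounts/lakedemo"
# A different resource group does not.
other = "/subscriptions/sub-demo/resourceGroups/rg-other"

print(applies_to(assignment, storage))
print(applies_to(assignment, other))
```

The point of the sketch is that the scope behaves like a path prefix: assign the role at the resource group, and everything inside that resource group is covered.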
49 00:02:25,260 --> 00:02:28,750 So this role assignment, which is basically RBAC, 50 00:02:28,750 --> 00:02:33,750 role-based access control, allows us to identify a user, 51 00:02:33,840 --> 00:02:37,630 then identify what they should be able to do, 52 00:02:37,630 --> 00:02:40,340 and then define where they should be able to do that, 53 00:02:40,340 --> 00:02:44,273 and put all of that together into RBAC. 54 00:02:45,830 --> 00:02:49,210 So, with that foundation, let's talk a little bit 55 00:02:49,210 --> 00:02:51,720 about access control lists. 56 00:02:51,720 --> 00:02:54,810 So access control lists are simply a way 57 00:02:54,810 --> 00:02:59,810 to give more detailed, finer-grained access 58 00:03:00,230 --> 00:03:03,380 to directories and files that you may have. 59 00:03:03,380 --> 00:03:06,710 Now, it is important to note before we go too far into this, 60 00:03:06,710 --> 00:03:09,560 the way that ACLs are evaluated. 61 00:03:09,560 --> 00:03:12,460 So, we start off with RBAC. 62 00:03:12,460 --> 00:03:15,090 That is the primary check. 63 00:03:15,090 --> 00:03:19,180 So if you grant access to something in RBAC, 64 00:03:19,180 --> 00:03:21,860 then access has been granted, and ACLs aren't checked. 65 00:03:21,860 --> 00:03:24,680 If you don't give access to something in RBAC, 66 00:03:24,680 --> 00:03:26,800 then we'll look at access control lists 67 00:03:26,800 --> 00:03:29,770 and decide if, "okay, well you still can have access". 68 00:03:29,770 --> 00:03:32,140 So at a high level, 69 00:03:32,140 --> 00:03:34,840 we could say for RBAC, you don't have permission 70 00:03:34,840 --> 00:03:38,200 to access anything in the storage account. 71 00:03:38,200 --> 00:03:40,370 However, in our access control list, 72 00:03:40,370 --> 00:03:41,950 we could say that you have access 73 00:03:41,950 --> 00:03:44,410 to this 1 folder or 1 file. 74 00:03:44,410 --> 00:03:48,850 So it's a little bit finer-grained than RBAC is.
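The evaluation order just described (RBAC first, ACLs only as a fallback) can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `can_access` function, the principal names, and the paths are hypothetical, and the sketch ignores details such as the execute permission needed on parent directories.

```python
def can_access(principal, path, op, rbac_grants, acls):
    """op is one of "r", "w", "x"."""
    # Step 1: RBAC. If a role assignment already allows the operation,
    # access is granted and the ACL is never consulted.
    if (principal, op) in rbac_grants:
        return True
    # Step 2: no RBAC grant, so fall back to the ACL on this specific
    # file or directory. A missing entry is treated as "---".
    perms = acls.get(path, {}).get(principal, "---")
    return op in perms


# Hypothetical data: the ETL app has read/write through RBAC,
# the analyst has no RBAC grant but has read on one file through its ACL.
rbac_grants = {("etl-app", "r"), ("etl-app", "w")}
acls = {"raw/sales.csv": {"analyst": "r--"}}

print(can_access("etl-app", "raw/other.csv", "w", rbac_grants, acls))
print(can_access("analyst", "raw/sales.csv", "r", rbac_grants, acls))
print(can_access("analyst", "raw/sales.csv", "w", rbac_grants, acls))
```

Notice that the analyst gets into exactly one file and nothing else, which is the "this 1 folder or 1 file" case from the lesson.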
75 00:03:48,850 --> 00:03:52,810 Now, access control lists basically associate 76 00:03:52,810 --> 00:03:56,240 a security principal with an access level. 77 00:03:56,240 --> 00:03:59,110 So we talked about the first step of RBAC, 78 00:03:59,110 --> 00:04:00,550 and that's how we identify the user 79 00:04:00,550 --> 00:04:02,880 or the application that's trying to get access, 80 00:04:02,880 --> 00:04:05,550 and then we marry that with an access level 81 00:04:05,550 --> 00:04:07,400 to decide whether or not you should be able 82 00:04:07,400 --> 00:04:10,270 to access this file or this folder. 83 00:04:10,270 --> 00:04:14,748 Now, with ACL, each file or directory in a 84 00:04:14,748 --> 00:04:18,610 Data Lake Gen2 account is going to have its own ACL. 85 00:04:18,610 --> 00:04:19,830 And so we can go in there 86 00:04:19,830 --> 00:04:24,580 and we can determine access for each file or directory. 87 00:04:24,580 --> 00:04:26,470 And we can programmatically set this, 88 00:04:26,470 --> 00:04:29,120 or we can also set this in the portal. 89 00:04:29,120 --> 00:04:30,830 And there are a couple of different permissions. 90 00:04:30,830 --> 00:04:32,210 And I actually wanted to pull this up 91 00:04:32,210 --> 00:04:35,910 to show you kind of the shorthand of what this looks like. 92 00:04:35,910 --> 00:04:37,030 So, we have 93 00:04:37,030 --> 00:04:38,920 RWX 94 00:04:38,920 --> 00:04:40,780 and dash. 95 00:04:40,780 --> 00:04:43,320 The R stands for read. 96 00:04:43,320 --> 00:04:46,900 So if I only have R permission, that's read-only. 97 00:04:46,900 --> 00:04:51,900 If I have W, that is write, so I can write to a file. 98 00:04:52,280 --> 00:04:55,310 And then the X is execute. 99 00:04:55,310 --> 00:04:59,420 So, this was something that was already there in Gen1. 100 00:04:59,420 --> 00:05:01,440 If we're talking about a file in Data Lake Gen2, 101 00:05:01,440 --> 00:05:05,683 it really means nothing, but the X stands for execute.
102 00:05:06,680 --> 00:05:10,050 And the execute on a directory is what allows you 103 00:05:10,050 --> 00:05:15,050 to move through child folders within a larger directory. 104 00:05:15,370 --> 00:05:17,810 But that's what RWX is. 105 00:05:17,810 --> 00:05:19,900 Now, the R and the W, obviously, 106 00:05:19,900 --> 00:05:21,510 are going to be the most utilized. 107 00:05:21,510 --> 00:05:23,010 And then finally we have the dash, 108 00:05:23,010 --> 00:05:25,090 the dash just means you don't have that permission. 109 00:05:25,090 --> 00:05:26,830 So you can see here from this permission list, 110 00:05:26,830 --> 00:05:28,210 the very top, the RWX, would give you access 111 00:05:28,210 --> 00:05:30,560 to read, write, and traverse. 112 00:05:30,560 --> 00:05:34,050 The one below that, the R-X, would only give you permission 113 00:05:34,050 --> 00:05:36,640 to read and go through subdirectories. 114 00:05:36,640 --> 00:05:38,590 The one below that, the R--, would only give you permission 115 00:05:38,590 --> 00:05:40,940 to read things in the primary directory, 116 00:05:40,940 --> 00:05:42,950 but not any subdirectories. 117 00:05:42,950 --> 00:05:45,110 And then the bottom one, the ---, 118 00:05:45,110 --> 00:05:49,070 gives you absolutely zero permission to do anything. 119 00:05:49,070 --> 00:05:50,900 So that's the shorthand version. 120 00:05:50,900 --> 00:05:52,720 You'll see that when we jump into the portal 121 00:05:52,720 --> 00:05:55,580 for RBAC as well, the read and write, 122 00:05:55,580 --> 00:05:57,020 but that is the shorthand version 123 00:05:57,020 --> 00:05:59,260 when you're talking about ACLs. 124 00:05:59,260 --> 00:06:01,420 So with that, we should have the basics down 125 00:06:01,420 --> 00:06:02,890 for RBAC and ACLs, 126 00:06:02,890 --> 00:06:05,890 at least as far as the DP-203 is concerned. 127 00:06:05,890 --> 00:06:07,900 So let's go ahead and jump over 128 00:06:07,900 --> 00:06:11,463 into the portal and take a look at RBAC and ACL. 
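If the RWX shorthand is new to you, a tiny parser may help it stick. This is just a sketch for reading the three-character form from the lesson; the function name `parse_permissions` is made up for this example and is not part of any Azure SDK.

```python
def parse_permissions(short: str) -> dict:
    """Turn a three-character shorthand like "r-x" into named flags."""
    if len(short) != 3:
        raise ValueError("expected exactly three characters, e.g. 'r-x'")
    result = {}
    # Position 1 may be "r" or "-", position 2 "w" or "-", position 3 "x" or "-".
    for name, flag, ch in zip(("read", "write", "execute"), "rwx", short):
        if ch not in (flag, "-"):
            raise ValueError(f"unexpected character {ch!r} for {name}")
        result[name] = ch == flag
    return result


# The four entries from the permission list in the lesson.
for entry in ("rwx", "r-x", "r--", "---"):
    print(entry, parse_permissions(entry))
```

Each position is either its letter or a dash, so `r-x` reads as "read yes, write no, execute yes".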
129 00:06:12,830 --> 00:06:14,970 Okay, so now we find ourselves 130 00:06:14,970 --> 00:06:18,330 in the Azure portal in a storage account. 131 00:06:18,330 --> 00:06:21,240 And what I've done is I've clicked on Access Control, 132 00:06:21,240 --> 00:06:25,230 or IAM, which is identity and access management. 133 00:06:25,230 --> 00:06:28,830 And within here, I can actually work with RBAC. 134 00:06:28,830 --> 00:06:31,440 So we talked about roles, for instance. 135 00:06:31,440 --> 00:06:32,780 So I can click on Roles here, 136 00:06:32,780 --> 00:06:34,630 and you can see that there are a whole lot 137 00:06:34,630 --> 00:06:37,280 of roles that are already configured. 138 00:06:37,280 --> 00:06:41,260 And if I were to click on View here, again, 139 00:06:41,260 --> 00:06:46,260 I can also go to JSON and I can see a JSON version 140 00:06:46,410 --> 00:06:49,000 of what we were looking at just a second ago. 141 00:06:49,000 --> 00:06:53,150 So I can come in here and I can see that this is the role. 142 00:06:53,150 --> 00:06:56,030 This is the description of what the role is. 143 00:06:56,030 --> 00:06:58,430 And then you can see actions and notActions. 144 00:06:58,430 --> 00:07:00,280 So actions are operations the role allows, 145 00:07:00,280 --> 00:07:02,650 notActions are operations carved out of those actions. 146 00:07:02,650 --> 00:07:03,870 And then I can assign that 147 00:07:03,870 --> 00:07:05,330 at a whole bunch of different scopes, 148 00:07:05,330 --> 00:07:08,950 and then I can also assign that to users as well. 149 00:07:08,950 --> 00:07:13,950 So with all of that, that's how RBAC works in Azure. 150 00:07:14,530 --> 00:07:17,430 Let's also take a look at ACLs. 151 00:07:17,430 --> 00:07:20,993 So if I jump into one of my containers, for instance, 152 00:07:21,870 --> 00:07:24,560 I can click on this container 153 00:07:24,560 --> 00:07:27,100 and then this individual CSV file. 
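Going back for a moment to the JSON view of a role: here is a rough Python sketch of how the actions and notActions lists in that JSON combine. The role below is a made-up example, not a built-in Azure role, and the `role_allows` helper is a hypothetical name; the wildcard matching uses Python's standard `fnmatch` module as a stand-in for Azure's own matching, so treat it as an approximation.

```python
from fnmatch import fnmatchcase

# Hypothetical role definition, shaped like the JSON shown in the portal.
role = {
    "roleName": "Demo Storage Role (hypothetical)",
    "description": "Everything on storage accounts except deleting them.",
    "actions": ["Microsoft.Storage/storageAccounts/*"],
    "notActions": ["Microsoft.Storage/storageAccounts/delete"],
}


def role_allows(role: dict, operation: str) -> bool:
    """Allowed if it matches an action and is not carved out by a notAction."""
    op = operation.lower()
    allowed = any(fnmatchcase(op, p.lower()) for p in role["actions"])
    excluded = any(fnmatchcase(op, p.lower()) for p in role["notActions"])
    return allowed and not excluded


print(role_allows(role, "Microsoft.Storage/storageAccounts/read"))
print(role_allows(role, "Microsoft.Storage/storageAccounts/delete"))
```

The design idea is subtraction: start from what actions allows, then remove whatever notActions lists.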
154 00:07:27,100 --> 00:07:29,450 And if I were to click on this over here on the right, 155 00:07:29,450 --> 00:07:31,350 the 3 dots, I can come down 156 00:07:31,350 --> 00:07:35,190 to Manage access control list, or Manage ACL, 157 00:07:35,190 --> 00:07:38,730 and you can see then that I can assign users 158 00:07:38,730 --> 00:07:42,400 the ability to read, write, and execute, that RWX. 159 00:07:42,400 --> 00:07:46,010 And then the nothing here, that would be our dashes. 160 00:07:46,010 --> 00:07:47,240 All right? 161 00:07:47,240 --> 00:07:48,073 That simple. 162 00:07:48,073 --> 00:07:50,470 I would just click on what I want, click on the Save button, 163 00:07:50,470 --> 00:07:53,250 and then that would give that user access 164 00:07:53,250 --> 00:07:55,053 to this file. 165 00:07:56,830 --> 00:07:58,720 So a couple of key points to remember. 166 00:07:58,720 --> 00:08:01,410 First, RBAC is a core concept. 167 00:08:01,410 --> 00:08:03,460 This is something that you have to think about 168 00:08:03,460 --> 00:08:05,640 regardless of the service, whether you're talking 169 00:08:05,640 --> 00:08:09,050 about data, or operations, or it doesn't really matter. 170 00:08:09,050 --> 00:08:09,883 You're going to need 171 00:08:09,883 --> 00:08:13,830 to understand identity and access management and RBAC. 172 00:08:13,830 --> 00:08:17,550 Role definitions should be a core cloud consideration. 173 00:08:17,550 --> 00:08:19,250 So this is not just for your employees, 174 00:08:19,250 --> 00:08:21,160 but also for your contractors. 175 00:08:21,160 --> 00:08:23,140 You need to think through the different types 176 00:08:23,140 --> 00:08:25,040 of roles that you're going to have, 177 00:08:25,040 --> 00:08:27,620 and figure out what they should have access to. 
178 00:08:27,620 --> 00:08:30,270 This is a really easy way to manage security 179 00:08:30,270 --> 00:08:32,380 and manage who has access to what 180 00:08:32,380 --> 00:08:35,220 based upon role definitions. 181 00:08:35,220 --> 00:08:37,330 So with that, this is what you need to know 182 00:08:37,330 --> 00:08:39,160 for the DP-203. 183 00:08:39,160 --> 00:08:41,360 Now, we did a fairly high-level walkthrough 184 00:08:41,360 --> 00:08:43,210 because 1, this should be a review, 185 00:08:43,210 --> 00:08:46,240 and 2, this is a data engineering course, 186 00:08:46,240 --> 00:08:50,200 it is not a security course, nor a security certification. 187 00:08:50,200 --> 00:08:53,970 So for the DP-203, this should be what you need to know 188 00:08:53,970 --> 00:08:55,360 in order to move forward. 189 00:08:55,360 --> 00:08:57,120 So with that, we're going to move forward, 190 00:08:57,120 --> 00:08:58,920 and I'll see you in the next lesson.