1 00:00:00,480 --> 00:00:02,700 Hello and welcome to a new session. 2 00:00:03,270 --> 00:00:09,720 In this video, we are going to talk about the principles of .NET and Java malware analysis. 3 00:00:11,920 --> 00:00:19,090 The first thing we are going to look at is the difference between interpreted and compiled programs. 4 00:00:19,600 --> 00:00:26,210 Compiled languages convert the source code into machine code. In interpreted languages, 5 00:00:26,510 --> 00:00:31,120 the source code is translated into a bytecode representation. 6 00:00:32,710 --> 00:00:36,970 So examples of compiled languages are C or C++. 7 00:00:37,540 --> 00:00:42,790 But for interpreted languages we have, for example, Java and .NET. 8 00:00:45,330 --> 00:00:53,460 So for compiled languages, this is the workflow: the source code on the left here (CPP stands for 9 00:00:53,460 --> 00:01:00,000 C++) is used by the compiler, which converts it into machine code. 10 00:01:00,630 --> 00:01:06,720 And then when the program runs, the machine code is loaded into memory and its instructions are processed 11 00:01:06,900 --> 00:01:09,130 by the CPU and executed. 12 00:01:10,440 --> 00:01:14,280 So we have machine code, which is native to that particular machine. 13 00:01:16,650 --> 00:01:20,560 Now, for interpreted languages, there are slight differences. 14 00:01:21,030 --> 00:01:23,940 So, for example, here is the source code. 15 00:01:24,570 --> 00:01:32,190 It has to go through a compiler as well, but the compiler converts it into a binary format, 16 00:01:32,520 --> 00:01:33,810 which is called bytecode. 17 00:01:34,170 --> 00:01:36,870 But this bytecode is different from this machine code. 18 00:01:37,800 --> 00:01:42,970 This machine code is also a binary format, but this machine code is native to this machine. 19 00:01:43,350 --> 00:01:47,610 That means it can run directly on the CPU. But here, 20 00:01:47,760 --> 00:01:53,910 the bytecode, even though it is a binary format, cannot be run directly by the CPU. 
21 00:01:54,450 --> 00:01:57,360 It needs to be interpreted by a virtual machine. 22 00:01:57,870 --> 00:02:05,510 And this virtual machine could be the .NET Framework or it could be the Java Runtime Environment. 23 00:02:06,210 --> 00:02:07,320 So that is the difference. 24 00:02:07,470 --> 00:02:10,080 So the bytecode is an intermediate language. 25 00:02:10,740 --> 00:02:21,750 It is not a completely native format, but it is something in between the source code and machine code. Now, 26 00:02:21,750 --> 00:02:28,920 because .NET and Java bytecode can be decompiled back into source code with relative ease, programmers 27 00:02:28,920 --> 00:02:34,280 use obfuscation to make it harder to analyze the code after decompilation. 28 00:02:35,130 --> 00:02:42,130 That is to say, you can decompile this bytecode back into source code quite accurately. 29 00:02:42,360 --> 00:02:49,230 So that is why many programmers, including malware authors, find ways to make it harder to analyze 30 00:02:49,230 --> 00:02:57,450 the decompiled code. This process of making the code harder to decompile, understand, and 31 00:02:57,450 --> 00:03:00,030 analyze is called obfuscation. 32 00:03:01,830 --> 00:03:03,860 Obfuscation comes in many forms. 33 00:03:04,170 --> 00:03:06,390 These are four of the main ones. 34 00:03:06,930 --> 00:03:11,010 We have string manipulation and nonsensical 35 00:03:11,010 --> 00:03:19,640 naming, which is one of the most prevalent ways. We can also use inclusion of unnecessary code to slow down the 36 00:03:19,650 --> 00:03:20,380 analysis. 37 00:03:21,120 --> 00:03:24,720 Also, there could be use of encoding or encryption. 38 00:03:25,970 --> 00:03:33,020 And next is anti-analysis, which can also prevent your tools from analyzing the code. 39 00:03:35,670 --> 00:03:38,210 Nonsensical naming of functions or variables. 
40 00:03:38,750 --> 00:03:45,810 This is an example. Here we have a method or function, and in this case you can 41 00:03:45,810 --> 00:03:50,020 see this technique is used to obscure the name of the function. 42 00:03:50,550 --> 00:03:57,240 So when the analyst reaches it, they have no clue as to what the purpose of this function is. 43 00:03:58,230 --> 00:04:07,410 And here you can also see that all the functions and the various classes have been renamed to make them nonsensical 44 00:04:07,740 --> 00:04:09,120 and harder to understand. 45 00:04:11,260 --> 00:04:18,490 The next one is string manipulation. String manipulation is where you take a long, normal string and 46 00:04:18,880 --> 00:04:26,560 put it through a complicated process by breaking up the string into various parts, using functions 47 00:04:27,370 --> 00:04:33,580 on each part, and then concatenating the results to build up the original string. 48 00:04:34,060 --> 00:04:35,860 For example, take a look at this here. 49 00:04:36,430 --> 00:04:44,310 This function, GetFolderPath, is there to get the path of the location where the file is located. 50 00:04:44,830 --> 00:04:51,890 However, it is put through a complicated process where you call multiple functions. 51 00:04:52,330 --> 00:05:00,020 So, for example, here you have a function which concatenates three double backslashes, and 52 00:05:00,020 --> 00:05:04,930 then that is concatenated further with a decryption function. 53 00:05:05,410 --> 00:05:07,660 And in this case, it is an AES 54 00:05:07,690 --> 00:05:15,370 decryption function, which takes as a parameter this encrypted string and, of course, the [unclear]. 55 00:05:15,580 --> 00:05:20,930 And after this string has gone through the functions, we get back the original directory. 56 00:05:21,370 --> 00:05:29,170 So this is where a string containing the directory location is being deliberately broken up into various 57 00:05:29,170 --> 00:05:29,860 components. 
58 00:05:30,220 --> 00:05:38,740 Each component is put through a function in order to convert it back to a string, and the strings are then concatenated. 59 00:05:40,360 --> 00:05:44,060 The next technique uses unnecessary instructions. 60 00:05:44,410 --> 00:05:51,570 So in this example, you will see many functions being created just to throw off the analysis. 61 00:05:52,090 --> 00:05:58,710 If you take, for example, the first function here and look into it, it just returns 62 00:05:58,720 --> 00:05:59,260 a number. 63 00:06:00,010 --> 00:06:02,870 So this number is not used anywhere else in the code. 64 00:06:03,220 --> 00:06:10,120 So this is actually put there to confuse the analyst, and the same thing is true 65 00:06:10,120 --> 00:06:11,270 for all of these as well. 66 00:06:11,720 --> 00:06:12,050 Yeah. 67 00:06:12,220 --> 00:06:14,850 These are put there as unnecessary instructions. 68 00:06:15,670 --> 00:06:17,230 So that's all for this video. 69 00:06:17,440 --> 00:06:18,550 Thank you for watching.
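The obfuscation techniques described in the video (nonsensical naming, string manipulation, and unnecessary code) can be sketched roughly as follows. This is a minimal illustration, not the sample shown on screen: the class name ObfuscationSketch, the method names a1, b2 and c3, the path C:\Temp, and the key 7 are all hypothetical, and a simple XOR decoder stands in for the AES decryption function mentioned in the talk.

```java
public class ObfuscationSketch {

    // Nonsensical naming: "a1" reveals nothing about its purpose.
    // It decodes a string by XOR-ing each character with a small key.
    // Assumption: XOR is only a stand-in for the AES decryption in the video.
    static String a1(String x, int k) {
        StringBuilder sb = new StringBuilder();
        for (char c : x.toCharArray()) {
            sb.append((char) (c ^ k));
        }
        return sb.toString();
    }

    // Unnecessary code: returns a number that no caller ever uses.
    static int b2() {
        return 48213;
    }

    // String manipulation: the path is never stored whole. It is rebuilt
    // at run time from separate pieces that are concatenated together.
    static String c3() {
        b2(); // junk call, result discarded
        String sep = String.valueOf((char) 92); // 92 is the backslash character
        return "C" + ":" + sep + a1("Sbjw", 7); // "Sbjw" XOR 7 decodes to "Temp"
    }

    public static void main(String[] args) {
        System.out.println(c3()); // prints C:\Temp
    }
}
```

Reading c3 statically, an analyst sees neither a meaningful name nor the final path; running it, or re-implementing a1, recovers the string, which is typically how this kind of obfuscation is undone in practice.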