1 00:00:00,090 --> 00:00:07,710 Let's talk about file I/O. What does I/O mean? You'll see this a lot in programming. 2 00:00:08,680 --> 00:00:11,260 I/O stands for input/output. 3 00:00:12,290 --> 00:00:19,940 You see, most of the time, machines are not communicating in just one environment. For example, so 4 00:00:19,940 --> 00:00:27,380 far we've been writing our code in, let's say, an editor like Sublime Text that I have here, or 5 00:00:27,380 --> 00:00:29,450 PyCharm, or an online REPL. 6 00:00:30,200 --> 00:00:34,400 But usually you want to interact with different parts of the system. 7 00:00:35,090 --> 00:00:37,820 Maybe you want to speak to another website. 8 00:00:38,060 --> 00:00:45,320 Maybe you want to speak to something that's on your desktop, maybe a file. Maybe two different machines 9 00:00:45,320 --> 00:00:46,940 are communicating with each other. 10 00:00:47,180 --> 00:00:48,920 Maybe you're speaking to a database. 11 00:00:49,520 --> 00:00:58,580 I/O simply means, hey, I want you to input something from the outside world and output something into 12 00:00:58,580 --> 00:00:59,420 the outside world. 13 00:01:00,550 --> 00:01:07,960 And one of the most common ways that we use things like I/O is through reading files. You might think 14 00:01:07,960 --> 00:01:14,680 about Python and how, wouldn't it be nice if we could perhaps write a script that compresses images? Well, 15 00:01:14,680 --> 00:01:21,370 we need I/O then: I need to input an image and then I need to output the compressed one. 16 00:01:23,970 --> 00:01:30,400 Maybe I want to work with a PDF file and maybe add a watermark to all my PDF pages. 17 00:01:30,420 --> 00:01:35,430 Well, then we input a PDF and output a new version of the PDF. 18 00:01:35,760 --> 00:01:38,940 So this is a very common task that we see a lot of. 19 00:01:39,120 --> 00:01:44,100 And reading and writing files is a very important tool in our tool belt.
20 00:01:45,030 --> 00:01:49,860 And by the way, we have a project coming up where we are actually going to use this with PDFs. 21 00:01:50,490 --> 00:01:54,930 But for now, how can we do this file input/output with Python? 22 00:01:56,480 --> 00:02:03,770 Well, Python has a built-in function that allows us to open and write to files. 23 00:02:05,130 --> 00:02:13,230 And it's simply called open, nice and easy. So using open, we can do something like this. I can 24 00:02:13,230 --> 00:02:19,230 create a sample text file on our desktop, so I'm going to use my terminal here. 25 00:02:20,380 --> 00:02:25,570 So if we look at the present working directory, I'm on the desktop. If you're on Windows, then you might 26 00:02:25,570 --> 00:02:27,870 have to use some different commands here. 27 00:02:27,880 --> 00:02:32,430 But by now, you should be pretty familiar with how to create a file if you want to. 28 00:02:32,770 --> 00:02:35,800 Now, I can do this manually, or in my terminal 29 00:02:35,800 --> 00:02:36,880 I can actually do, 30 00:02:36,880 --> 00:02:41,770 if you're on a Mac, you can just do touch, and then I'll say test.txt. 31 00:02:42,400 --> 00:02:49,510 So if I do ls here, you see that I have a test.txt file and the script from before. 32 00:02:51,000 --> 00:02:53,580 And again, just to double check, if I go to my desktop, 33 00:02:55,010 --> 00:03:03,380 yep, I have these two right here. Perfect. Now in here, we can simply say in our script file: 34 00:03:05,700 --> 00:03:06,210 open, 35 00:03:07,830 --> 00:03:16,770 and the name of the file. In our case, it's a string, "test.txt", just like that. 36 00:03:17,950 --> 00:03:22,000 Now, I can assign this to a variable, calling it my_file. 37 00:03:23,290 --> 00:03:31,150 And now we have, well, the file object. So let's check this out. If I do here print(my_file), 38 00:03:31,210 --> 00:03:32,070 let's see what happens. 39 00:03:32,410 --> 00:03:34,050 I'm going to run my code.
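[Editor's sketch] The steps described so far can be tried with something like the following. This is only a minimal sketch: the temporary folder and the `path` variable are assumptions added so the snippet runs anywhere, whereas the video works with a test.txt sitting on the desktop next to the script.

```python
import os
import tempfile

# Assumption: use a temporary folder instead of the desktop from the video.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'test.txt')

# Create an empty file, roughly what `touch test.txt` does in the terminal.
open(path, 'w').close()

# open() returns a file object, which we keep in a variable.
my_file = open(path)
print(my_file)  # shows the object's type along with its name, mode and encoding

# Keep a few details around so they can be inspected after closing.
file_type = type(my_file).__name__
file_mode = my_file.mode
my_file.close()
```

Printing the file object does not show the file's contents, only a description of the object itself.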
40 00:03:34,090 --> 00:03:37,120 So let's say python3 and then run the script. 41 00:03:41,470 --> 00:03:49,240 I get an object, a TextIOWrapper. I get the name, which is test.txt. I get mode, which I'm not sure 42 00:03:49,240 --> 00:03:55,420 what it is yet, we'll learn, and then encoding, which is how this file is encoded, which is UTF-8. 43 00:03:55,420 --> 00:03:58,300 Most files are usually encoded in UTF-8. 44 00:03:59,420 --> 00:04:02,390 All right, so how can I actually read this file? 45 00:04:03,230 --> 00:04:05,780 All we need to do is, my_file has a 46 00:04:07,670 --> 00:04:11,150 .read() method on it, so that if I run this now, 47 00:04:12,940 --> 00:04:20,220 well, I get a blank piece of space because there's nothing in this test.txt. So let's write something. 48 00:04:20,620 --> 00:04:29,350 I'll open this file in Sublime Text and just write, "Hi, my name is Andre Agassi." 49 00:04:29,620 --> 00:04:31,630 I got really, really creative with this one. 50 00:04:31,630 --> 00:04:32,140 Good job. 51 00:04:32,560 --> 00:04:33,700 All right, let's go back. 52 00:04:35,300 --> 00:04:36,710 So now if I read this file, 53 00:04:38,330 --> 00:04:42,460 look at that, I'm able to read "Hi, my name is Andre", and now there you go. 54 00:04:42,650 --> 00:04:43,580 I've read my file. 55 00:04:44,900 --> 00:04:45,690 Nice and easy. 56 00:04:46,130 --> 00:04:48,440 Now, if I run this again, let's just 57 00:04:49,680 --> 00:04:50,880 read this multiple times. 58 00:04:51,990 --> 00:04:52,740 If I click run, 59 00:04:56,120 --> 00:05:01,760 I'm able to read the first time around, but these two times I'm not reading anything. 60 00:05:02,030 --> 00:05:03,080 Why is that? 61 00:05:04,160 --> 00:05:08,330 Well, this open function has this idea of a cursor. 62 00:05:09,920 --> 00:05:12,530 That is, you can only read the file once before the cursor hits the end. 63 00:05:13,470 --> 00:05:17,190 And once you open, it returns a file object.
64 00:05:18,250 --> 00:05:23,860 And the contents of the file are read with a cursor, just like 65 00:05:23,860 --> 00:05:27,160 you see here, one by one, and printed onto the screen. 66 00:05:27,550 --> 00:05:33,190 But by the end of this first reading, the cursor is going to be at the end of the file. 67 00:05:33,550 --> 00:05:38,660 So now when it tries to read again, it's already at the end of the file and nothing will be left there. 68 00:05:39,490 --> 00:05:43,660 So the way we get around this is to do something like this. 69 00:05:43,840 --> 00:05:53,800 We simply say my_file.seek, which moves our cursor to whatever index we want. In our case, 70 00:05:53,800 --> 00:05:54,170 seek(0). 71 00:05:54,730 --> 00:05:57,300 So if I run this now, there you go. 72 00:05:57,550 --> 00:06:00,160 And if I move the cursor back 73 00:06:01,760 --> 00:06:03,590 and I save and run this, 74 00:06:03,740 --> 00:06:05,440 all right, that's a lot better now. 75 00:06:06,980 --> 00:06:12,560 And this is just to demonstrate that Python uses this idea of a cursor to read a file. 76 00:06:13,940 --> 00:06:19,340 Now, another thing that I can do is to use readline, 77 00:06:20,910 --> 00:06:22,830 so that if I run this, 78 00:06:23,910 --> 00:06:30,900 it reads the line. But let's say our text file has different lines, so: "How are you?" 79 00:06:32,260 --> 00:06:36,070 Let's say a smiley face in here. If I now read the line, 80 00:06:38,700 --> 00:06:45,660 I get "Hi, my name is Andre". If I read the line again, "Hi, my name is Andre", because I get one line at a time. 81 00:06:46,950 --> 00:06:51,390 I only get the first line. If I print this multiple times 82 00:06:53,430 --> 00:06:54,480 and I run this, 83 00:06:55,350 --> 00:06:57,090 all right, that's better. I get 84 00:06:57,120 --> 00:06:58,680 "Hi, my name is Andre", smiley face, 85 00:06:58,710 --> 00:06:59,220 "How are you?"
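[Editor's sketch] The cursor behaviour described above can be reproduced roughly as follows. It is a sketch under assumptions: the temporary `path`, the `setup` handle, and the exact three-line sample text are made up for illustration and only loosely follow the file typed in the video.

```python
import os
import tempfile

# Assumption: a throwaway file stands in for the desktop test.txt.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'test.txt')

# Setup only: write a hypothetical three-line sample into the file.
setup = open(path, 'w', encoding='utf-8')
setup.write('Hi, my name is Andre\n:)\nHow are you?')
setup.close()

my_file = open(path, encoding='utf-8')

first_read = my_file.read()    # reads everything; cursor is now at the end
second_read = my_file.read()   # nothing left to read, so an empty string

my_file.seek(0)                # move the cursor back to the start of the file

line_one = my_file.readline()    # each readline() call returns the next line
line_two = my_file.readline()
line_three = my_file.readline()  # last line, no trailing newline in this sample

my_file.close()

print(repr(first_read))
print(repr(second_read))
print(repr(line_one), repr(line_two), repr(line_three))
```

Using `repr` when printing makes the newline characters and the empty string visible, which is handy when checking where the cursor has got to.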
86 00:06:59,790 --> 00:07:05,040 Again, the cursor keeps moving, right, so I can just keep reading the lines. 87 00:07:05,580 --> 00:07:10,110 Another thing that I can do is to just say readlines. If I run this, 88 00:07:11,450 --> 00:07:19,190 I get a list that contains the entire file. It reads all the lines, and you see here that I have "Hi, 89 00:07:19,190 --> 00:07:20,920 my name is Andre Nagui." 90 00:07:21,170 --> 00:07:25,610 I have a new line here, a smiley face, a new line. 91 00:07:25,800 --> 00:07:27,650 Remember, this is an escape sequence. 92 00:07:28,250 --> 00:07:29,240 And then "How are you?" 93 00:07:29,450 --> 00:07:32,300 And no new line here because there's no new line. 94 00:07:32,300 --> 00:07:33,800 It's the end of the file. 95 00:07:35,150 --> 00:07:43,760 So these are extremely useful. Maybe we can use regular expressions to search for a piece of text in 96 00:07:43,760 --> 00:07:45,430 a file. That's pretty useful, right? 97 00:07:47,670 --> 00:07:53,490 Now, the very last thing we need to do, and this is a little annoying, so I'll show you how we don't 98 00:07:53,490 --> 00:07:54,860 need to do it in the future, 99 00:07:56,730 --> 00:08:02,880 but you actually have to manually close the file after you've opened it with open, 100 00:08:04,580 --> 00:08:10,430 so you can use it somewhere else in the program. It's just a good standard. So what you have to do is 101 00:08:10,430 --> 00:08:13,750 say my_file.close() after you're done with it. 102 00:08:15,620 --> 00:08:16,730 You tell your computer, hey, 103 00:08:17,970 --> 00:08:22,550 you need to stop whatever you're doing. I'm not interested in the file anymore, we're done with it. 104 00:08:22,560 --> 00:08:23,760 You can use it somewhere else. 105 00:08:24,190 --> 00:08:27,540 So usually we do something like this and we're all done. 106 00:08:28,740 --> 00:08:31,290 Let's take a break and learn some more in the next video.
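[Editor's sketch] To round off, here is roughly what readlines and close look like together. As before, the temporary `path`, the `setup` handle, and the sample text are assumptions made so the snippet is self-contained; they are not taken from the video.

```python
import os
import tempfile

# Assumption: a throwaway file with hypothetical sample text.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'test.txt')
setup = open(path, 'w', encoding='utf-8')
setup.write('Hi, my name is Andre\n:)\nHow are you?')
setup.close()

my_file = open(path, encoding='utf-8')

# readlines() returns a list with one string per line; every line except
# the last keeps its trailing '\n' because the sample has no final newline.
lines = my_file.readlines()
print(lines)

# Close the file manually once we are done with it.
my_file.close()
is_closed = my_file.closed  # True after close()
```

After `close()`, further reads on `my_file` raise a `ValueError`, so closing belongs at the very end of the work with the file.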