If you don’t use “with”, when does Python close files? The answer is: It depends.

January 18, 2015. By Reuven

One of the first things that Python programmers learn is that you can easily read through the contents of an open file by iterating over it:

f = open('/etc/passwd')
for line in f:
    print(line)

Note that the above code is possible because our file object “f” is an iterator. In other words, f knows how to behave inside of a loop — or any other iteration context, such as a list comprehension.
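As a small illustration of that point, the sketch below writes a throwaway file and then shows that the file object is its own iterator and works in a list comprehension. The file name lines.txt and the temporary directory are my own choices for the demonstration, not part of the example above.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as out:
    out.write("first\nsecond\nthird\n")

f = open(path)
# A file object is its own iterator: iter(f) hands back f itself.
is_own_iterator = iter(f) is f
# Any iteration context works, including a list comprehension.
stripped = [line.rstrip("\n") for line in f]
f.close()

print(is_own_iterator)
print(stripped)
```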

Most of the students in my Python courses come from other programming languages, in which they are expected to close a file when they’re done using it. It thus doesn’t surprise me when, soon after I introduce them to files in Python, they ask how we’re expected to close them.

The simplest answer is that we can explicitly close our file by invoking f.close(). Once we have done that, the object continues to exist — but we can no longer read from it, and the object’s printed representation will also indicate that the file has been closed:

>>> f = open('/etc/passwd')
>>> f
<open file '/etc/passwd', mode 'r' at 0x10f023270>
>>> f.read(5)
'##\n# '

>>> f.close()
>>> f
<closed file '/etc/passwd', mode 'r' at 0x10f023270>

>>> f.read(5)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-11-ef8add6ff846> in <module>()
----> 1 f.read(5)
ValueError: I/O operation on closed file

But here’s the thing: When I’m programming in Python, it’s pretty rare for me to explicitly invoke the “close” method on a file. Moreover, the odds are good that you probably don’t want or need to do so, either.

The preferred, best-practice way of opening files is with the “with” statement, as in the following:

with open('/etc/passwd') as f:
    for line in f:
        print(line)

The “with” statement uses the file object as what Python calls a “context manager”. It assigns the new file instance, which points to the contents of /etc/passwd, to f. Within the block of code opened by “with”, our file is open, and can be read from freely.

However, once Python exits from the “with” block, the file is automatically closed. Trying to read from f after we have exited from the “with” block will result in the same ValueError exception that we saw above. Thus, by using “with”, you avoid the need to explicitly close files. Python does it for you, in a somewhat un-Pythonic way, magically, silently, and behind the scenes.
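Here is a minimal sketch of that behavior, using a scratch file rather than /etc/passwd so that it runs anywhere. The file name demo.txt and the variable names are illustrative assumptions.

```python
import os
import tempfile

# Hypothetical scratch file so the example does not depend on /etc/passwd.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as out:
    out.write("hello\n")

with open(path) as f:
    inside = f.closed        # still open inside the block
    first_line = f.readline()

outside = f.closed           # closed as soon as the block exited

try:
    f.read(5)
    error = None
except ValueError as exc:    # the same kind of error as after f.close()
    error = exc

print(inside, outside, repr(error))
```

Note that the name f still exists after the block; only the underlying file has been closed.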

But what if you don’t explicitly close the file? What if you’re a bit lazy, and neither use a “with” block nor invoke f.close()?  When is the file closed?  When should the file be closed?

I ask this, because I have taught Python to many people over the years, and am convinced that trying to teach “with” and/or context managers, while also trying to teach many other topics, is more than students can absorb. While I touch on “with” in my introductory classes, I normally tell them that at this point in their careers, it’s fine to let Python close files, either when the reference count to the file object drops to zero, or when Python exits.

In my free e-mail course about working with Python files, I took a similarly with-less view of things, and didn’t use it in all of my proposed solutions. Several people challenged me, saying that not using “with” is showing people a bad practice, and runs the risk of having data not saved to disk.

I got enough e-mail on the subject to ask myself: When does Python close files, if we don’t explicitly do so ourselves or use a “with” block? That is, if I let the file close automatically, then what can I expect?

My assumption was always that Python closes files when the object’s reference count drops to zero, and it is thus garbage collected. This is hard to prove or check when we have opened a file for reading, but it’s trivially easy to check when we open a file for writing. That’s because when you write to a file, the contents aren’t immediately flushed to disk (unless you pass 0 as the third, optional “buffering” argument to “open”, which Python 3 allows only in binary mode). Instead, they are flushed when the buffer fills up or when the file is closed.
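A rough way to see this buffering for yourself, assuming Python 3, is to compare the size of the file on disk before and after closing it. The helper name size_before_close and the file name buffered.out are mine, and the exact numbers may differ on other implementations or platforms.

```python
import os
import tempfile

def size_before_close(mode, buffering, data):
    """Write data, then report the on-disk size before and after close()."""
    path = os.path.join(tempfile.mkdtemp(), "buffered.out")
    f = open(path, mode, buffering)
    f.write(data)
    before = os.path.getsize(path)
    f.close()
    after = os.path.getsize(path)
    return before, after

# Default buffering (-1) in text mode: a small write usually stays in memory.
buffered = size_before_close("w", -1, "abc\n")
# buffering=0 means unbuffered; Python 3 accepts it only in binary mode.
unbuffered = size_before_close("wb", 0, b"abc\n")

print("buffered (before, after):  ", buffered)
print("unbuffered (before, after):", unbuffered)
```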

I thus decided to conduct some experiments, to better understand what I can (and cannot) expect Python to do for me automatically. My experiment consisted of opening a file, writing some data to it, deleting the reference, and then exiting from Python. I was curious to know when the data would be written, if ever.

My experiment looked like this:

f = open('/tmp/output', 'w')
f.write('abc\n')
f.write('def\n')
# check contents of /tmp/output (1)
del(f)
# check contents of /tmp/output (2)
# exit from Python
# check contents of /tmp/output (3)

In my first experiment, conducted with Python 2.7.9 on my Mac, I can report that at stage (1) the file existed but was empty, and at stages (2) and (3), the file contained all of its contents. Thus, it would seem that in CPython 2.7, my original intuition was correct: When a file object is garbage collected, its __del__ (or the equivalent thereof) flushes and closes the file. And indeed, invoking “lsof” on my IPython process showed that the file was closed after the reference was removed.
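For anyone who wants to repeat stages (1) and (2) without checking the file by hand, here is one possible scripted version. It is a sketch, not the exact procedure I followed: it writes to a temporary directory instead of /tmp/output and uses os.path.getsize to inspect the file. Stage (3) still has to be checked after the interpreter exits.

```python
import os
import platform
import tempfile

# Stand-in for /tmp/output, placed in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "output")

f = open(path, "w")
f.write("abc\n")
f.write("def\n")
stage1 = os.path.getsize(path)   # (1) before the reference is removed

del f
stage2 = os.path.getsize(path)   # (2) after the reference is removed

print(platform.python_implementation(), stage1, stage2)
```

Running the same script under different interpreters is a quick way to compare their behavior at stage (2).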

What about Python 3?  I ran the above experiment under Python 3.4.2 on my Mac, and got identical results. Removing the final (well, only) reference to the file object resulted in the file being flushed and closed.

This is good for 2.7 and 3.4.  But what about alternative implementations, such as PyPy and Jython?  Perhaps they do things differently.

I thus tried the same experiment under PyPy 2.7.8. And this time, I got different results!  Deleting the reference to our file object (that is, stage (2)) did not result in the file’s contents being flushed to disk. I have to assume that this has to do with differences in the garbage collector, or something else that works differently in PyPy than in CPython. But if you’re running programs in PyPy, then you should definitely not expect files to be flushed and closed just because the final reference pointing to them has gone out of scope. lsof showed that the file stuck around until the Python process exited.

For fun, I decided to try Jython 2.7b3. And Jython exhibited the same behavior as PyPy.  That is, deleting the reference did not flush the data, but exiting from Python did ensure that the data was flushed from the buffers and stored to disk.

I repeated these experiments, but instead of writing “abc\n” and “def\n”, I wrote “abc\n” * 1000 and “def\n” * 1000.

In the case of Python 2.7, nothing was written after the “abc\n” * 1000. But when I wrote “def\n” * 1000, the file contained 4096 bytes — which probably indicates the buffer size. Invoking del(f) to remove the reference to the file object resulted in its being flushed and closed, with a total of 8,000 bytes. So in the case of Python 2.7, the behavior is basically the same regardless of string size; the only difference is that if you exceed the size of the buffer, then some data will be written to disk before the final flush + close.

In the case of Python 3, the behavior was different: No data was written after either of the 4,000 byte outputs written with f.write. But as soon as the reference was removed, the file was flushed and closed. This might point to a larger buffer size. But still, it means that removing the final reference to a file causes the file to be flushed and closed.
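The larger writes can be watched in the same way. The following sketch records the on-disk size after each write and again after an explicit close, and prints io.DEFAULT_BUFFER_SIZE for comparison. The file name big.out is an assumption, and I pass newline="\n" only so that the byte count is the same on every platform. The intermediate sizes depend on the interpreter and its buffer sizes, so I make no claim about them here.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big.out")
sizes = []

# newline="\n" turns off newline translation, so each chunk is 4,000 bytes.
f = open(path, "w", newline="\n")
for chunk in ("abc\n" * 1000, "def\n" * 1000):
    f.write(chunk)
    sizes.append(os.path.getsize(path))   # bytes on disk after each write
f.close()
sizes.append(os.path.getsize(path))       # bytes on disk after close()

print("default buffer size:", io.DEFAULT_BUFFER_SIZE)
print("on-disk sizes:", sizes)
```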

In the case of PyPy and Jython, the behavior with a large file was the same as with a small one: The file was flushed and closed when the PyPy or Jython process exited, not when the last reference to the file object was removed.

Just to double check, I also tried these using “with”. In all of these cases, it was easy to predict when the file would be flushed and closed: When the block exited, and the context manager fired the appropriate method behind the scenes.
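To make "the appropriate method" a little less mysterious, here is a toy context manager that records when each hook runs. The class name Announcer is hypothetical. File objects implement the same two methods, and their __exit__ closes the file.

```python
class Announcer:
    """A toy context manager that records when each hook runs."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self              # this is the object that "as" binds to

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False             # do not swallow exceptions from the block


with Announcer() as a:
    a.events.append("body")

print(a.events)
```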

In other words: If you don’t use “with”, then your data isn’t necessarily in danger of disappearing, at least not in simple situations. However, you cannot know for sure when the data will be saved: it might be when the final reference is removed, or when the program exits. If you’re assuming that files will be closed when functions return, because the only reference to the file is in a local variable, then you might be in for a surprise. And if you have multiple processes or threads writing to the same file, then you’re really going to want to be careful here.
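If a function needs its file to be closed by the time it returns, regardless of which Python implementation runs it, one option is to put the “with” block inside the function. The function name save_report and the file name below are illustrative assumptions.

```python
import os
import tempfile

def save_report(path, lines):
    """Write lines to path; the file is closed before the function returns."""
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")
    # The block has exited, so the file is already flushed and closed here.
    return f.closed


path = os.path.join(tempfile.mkdtemp(), "report.txt")
closed_on_return = save_report(path, ["abc", "def"])

with open(path) as check:
    contents = check.read()

print(closed_on_return, repr(contents))
```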

Perhaps this behavior could be specified better, and thus work similarly or identically on different platforms? Perhaps we could even see the start of a Python specification, rather than pointing to CPython and saying, “Yeah, whatever that version does is the right thing.”

I still think that “with” and context managers are great. And I still think that it’s hard for newcomers to Python to understand what “with” does. But I also think that I’ll have to start warning new developers that if they decide to use alternative versions of Python, there are all sorts of weird edge cases that might not work identically to CPython, and that might bite them hard if they’re not careful.

  • I have a question about the “with” block. Once the “with” block is done, the file is closed. Is it possible for an error to occur when the “with” block closes the file? And if so, how does one handle the exception (knowing that it comes from a close operation, not an open operation)?

    • I guess it’s possible that you could encounter an error when closing a file, but I’m not sure what would happen, what exception you would get, or what you could do about it.

      The call to “close” happens in the __exit__ method, which is invoked when the “with” block is finishing things up. I’m not sure if it catches exceptions there, or what happens if one occurs.

  • Newcomers aside, to this day it irks me that I can’t safely do

    for line in open("file.txt"): …

    and instead have to add a level of nesting.

  • I just recently read this article. Thanks, it was helpful. I’m new and a novice with Python. I wrote a couple of programs for our community radio station to write data to a file, say every 5 minutes. I used an open, write, and close sequence instead of the ‘with’. These programs have run reliably for months, but there is always reason to improve. Recently I noticed that when I opened a file with a text editor, the most recent line didn’t appear. I discovered that my code had “f.close”, not “f.close()”. The interesting thing is that this doesn’t raise an error on a Mac or Windows 10 computer. I did a little testing to confirm the behavior. Do you have any idea why IDLE would be OK with “f.close”, and what, if anything, “f.close” does? In the end you’ve got me converted: I changed over to “with”.

    • Alexander says:

      f.close returns the method object; as you learned the hard way, it does not actually call that method. As a demonstration of this, you can try `method = f.close`, in which case calling `method()` would be the same as calling `f.close()`.

      It does not raise an error because that is a valid statement (although it does not do anything). These statements are actually very common in a REPL shell (i.e., if you run python in a terminal/console/IDLE’s shell). There you might type something like 2 + 2 and get the answer 4 printed back. You get no error, so this is a valid statement, and you can write it on its own line in a Python script, and it won’t raise an error. Since you’re not storing it in a variable, the result (4) will be discarded, and the statement will have no effect. The same happens with the statement f.close: it returns a method object (and if you type it into a shell, it will output something like “<built-in method close of _io.TextIOWrapper object at 0x10364b6c0>”). However, since you’re not doing anything with that object, the statement has no effect.

      • Alexander says:

        I meant that if you type f.close into a shell, it will output something like:

        • Alexander says:

          something like:

          built-in method close of _io.TextIOWrapper object at 0x10364b6c0

          Sorry, keeps getting deleted because of the angle brackets.

  • Arrived here while myself trying to determine how best to approach this with beginner-intermediate students; my conclusion is that (a rarity for Python) the open() syntax / context managers etc is just confusing and might best be avoided altogether if possible unless the students really need the given knowledge.

    My approach will be to teach primarily without using `with`, but I will mention it briefly as a structure they may come across and briefly give the reasoning.

  • Shanice A. Bryan says:

    Hi,

    I realized that one of the limitations of using the “with” is that you have to be careful if you have different pointer objects to the same filename and they are in different functions.

    In the above scenario, you actually have to explicitly close the file or else when you write to the file, it won’t be how you had intended for it to be, even despite having your print statement sequentially correct.

    However, this may be due to the fact that I was using it in “a+” mode.

  • Chris Wilson says:

    The undefined nature of the result otherwise, is a perfect argument for using “with”.

    It would also constrain implementations unnecessarily and add to the baggage of the language, to specify what happens when you don’t explicitly close a file, or when it should be garbage collected. The simple answer is: “Don’t do that. If you want defined results, just close the file already.”

    Oh, and your captcha timeout is too short, it timed out while reading your article and writing this reply, and I nearly lost my reply. (Luckily my browser had saved it locally, so when I clicked on the back button, it was still there.) Have you thought about using Disqus instead of rolling your own comment system?

    • A good point.

      As for the comment system, I’m just using what is built into WordPress, along with a plugin that does Captchas. I took a long time to include those, because I hate them so much, but after getting incredible amounts of comment spam, I threw up my hands and decided to do that. Using Disqus isn’t a bad idea, but for now, the number of comments is low enough that I don’t see it as a major issue. (You could argue that this is why I have so few comments, of course…)

  • Hello !

    Very interesting article, thanks.

    I actually have an argument in favour of using `with`:
    if the program ever gets interrupted in an abnormal way, say by a system signal under Linux, the file won’t ever be written to disk.

    You can test this very easily:

    import os, signal
    # Keep a reference, so the file stays open with its data still buffered
    f = open('/var/tmp/out.txt', 'w+')
    f.write('WRITTEN\n')
    os.kill(os.getpid(), signal.SIGTERM)

    Then check: '/var/tmp/out.txt' will be empty.

    Regards.

  • IronPython behaves the same as Jython and PyPy – the file is closed much later. This caused some problems for me, so I needed to either use a context manager or manually invoke close.
