python - Create a new file, moving an existing file out of the way if needed - Stack Overflow

What is the best way to create a new file in Python, moving an existing file with the same name to a different path if needed?

While you could do

import os

if os.path.exists(name_of_file):
    os.rename(name_of_file, backup_name)
f = open(name_of_file, "w")

that has TOCTOU issues (e.g. multiple processes could try to create the file at the same time). Can I avoid those issues using only the standard library (preferably), or is there a package which handles this?

You can assume a POSIX file system.

Asked Apr 2 at 5:40 by James Tocknell
  • How important is the metadata of the existing file? Would it be suitable to "move" it by reading the content, then writing it to the file at backup_name? – Aemyl Commented Apr 2 at 5:55
  • Why not add a UUID to the file name: name_of_file = f'my_file_{uuid.uuid4()}.txt'? – Guy Commented Apr 2 at 5:56
  • What if backup_name already exists? – Adon Bilivit Commented Apr 2 at 7:09
  • What's the expected behavior of the other processes that try to create the file at the same time? Are they supposed to detect that another process is creating a new file, wait until it has been created, and then simply write to it? Or are they supposed to back up the newly created file and create yet another new file to write to? – blhsing Commented Apr 2 at 7:14
  • I agree with @Guy, although I think the intention should be to add the UUID/GUID to the backup filename – Adon Bilivit Commented 2 days ago
2 Answers

In a loop:

Open the file exclusively (open(..., "x") or O_CREAT|O_EXCL).

If that fails with FileExistsError (EEXIST), atomically os.rename the existing file to something else, then try again.

If that renaming fails with anything other than FileNotFoundError (ENOENT, meaning someone else removed or renamed the offending file before you did), break the loop and fail.

(It's not clear to me how your competing processes can know it is OK to move the existing file out of the way or not, but presumably you've knowledge of your use case that I do not.)
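
The loop described above can be sketched in Python roughly as follows. This is a minimal sketch, not a vetted implementation: the function name create_fresh and the max_attempts cap are assumptions added for illustration. Note that on POSIX os.rename silently replaces an existing backup_path, which may or may not be what you want.

```python
import os


def create_fresh(path, backup_path, max_attempts=10):
    """Create `path` exclusively, renaming any existing file to `backup_path`.

    `create_fresh` and `max_attempts` are hypothetical names for this sketch.
    """
    for _ in range(max_attempts):
        try:
            # Mode "x" maps to O_CREAT | O_EXCL: it fails if the name exists.
            return open(path, "x")
        except FileExistsError:
            pass
        try:
            # Atomic on POSIX; replaces backup_path if it already exists.
            os.rename(path, backup_path)
        except FileNotFoundError:
            # Someone else moved or removed the file first; just retry.
            continue
        # Any other OSError from rename propagates and ends the loop.
    raise RuntimeError(f"could not create {path!r} after {max_attempts} attempts")
```

Usage would look like `with create_fresh("data.txt", "data.txt.bak") as f: f.write("new")`. The attempt cap only guards against spinning forever when several processes keep recreating the file.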

If you want to avoid having a moment where the main file doesn't exist, there are at least a couple of ways. I don't know if or how Python exposes these POSIX system calls, but in terms of system call names:

  1. Create + write the new file to a temporary name on the same FS as the target (perhaps a random name in the same directory). Use open(O_CREAT|O_EXCL) to ensure you're creating a new file.
  2. link() the main filename to the backup filename, so both names refer to the main file (hard link). If this fails with EEXIST, unlink() the old backup file and try again.
  3. rename(temp, main) (or the equivalent renameat()) to atomically replace the main filename, leaving the backup filename as the only reference to the inode (plus any other hardlinks that already existed to the main file).

Until the rename at the end, the main filename has its original contents. It atomically changes to referring to the new file contents at that point.
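
For what it's worth, Python's standard library does wrap these calls (os.link, os.rename, and tempfile.mkstemp, which opens with O_CREAT | O_EXCL). A minimal sketch of the three steps follows. The function name replace_keeping_backup, the fsync before going live, and the handling of a missing main file are assumptions of this sketch, not part of the answer.

```python
import contextlib
import os
import tempfile


def replace_keeping_backup(path, backup_path, data):
    """Write `data` to `path`, leaving the previous version at `backup_path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Step 1: mkstemp() opens a randomly named file with O_CREAT | O_EXCL,
    # in the same directory (hence the same filesystem) as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before it goes live
        # Step 2: hard-link the current main file to the backup name.
        while True:
            try:
                os.link(path, backup_path)
                break
            except FileExistsError:
                # Stale backup in the way: drop it and try again.
                with contextlib.suppress(FileNotFoundError):
                    os.unlink(backup_path)
            except FileNotFoundError:
                break  # no main file yet, so there is nothing to back up
        # Step 3: atomically point the main name at the new contents.
        os.rename(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

One thing to watch: mkstemp creates the file with mode 0600, so you may want an os.chmod before the final rename if the main file needs other permissions.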

There's no way to keep other hard links referring to the main file, rather than to what becomes the backup, without rewriting the contents of that file in place. That would mean first copying it (or reflink-duplicating it, as with cp --reflink=always) to the backup file, and then there would be a window while the main file is being written in which it holds neither the old original nor the complete new contents. But that's generally expected for hard links; use symlinks if continuing to refer to the main file is important.


Or on Linux, the renameat2(..., RENAME_EXCHANGE) system call atomically swaps two files, so you could open("backup", O_CREAT|O_TRUNC), write the new file there, then swap. But that has arguably more confusing crash behaviour: it's possible to end up with the backup filename holding a partially written newer version of the file.
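
As far as I know, Python's os module has no wrapper for renameat2, so the sketch below goes through ctypes. It assumes Linux with a libc that exports renameat2 (glibc 2.28 or later) and a filesystem that supports RENAME_EXCHANGE. The constants are the Linux values, and the helper names exchange and write_then_swap are made up for illustration.

```python
import ctypes
import os

AT_FDCWD = -100            # Linux: resolve relative paths against the cwd
RENAME_EXCHANGE = 1 << 1   # flag value from <linux/fs.h>


def exchange(path_a, path_b):
    """Atomically swap two existing paths via renameat2(2). Linux only."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.renameat2.argtypes = [
        ctypes.c_int, ctypes.c_char_p,
        ctypes.c_int, ctypes.c_char_p,
        ctypes.c_uint,
    ]
    ret = libc.renameat2(
        AT_FDCWD, os.fsencode(path_a),
        AT_FDCWD, os.fsencode(path_b),
        RENAME_EXCHANGE,
    )
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path_a)


def write_then_swap(path, backup_path, data):
    """Write the new contents under the backup name, then swap the two names."""
    # Mode "w" corresponds to O_CREAT | O_TRUNC: any old backup is discarded.
    with open(backup_path, "w") as f:
        f.write(data)
    # Both names must exist, or renameat2 fails with ENOENT.
    exchange(path, backup_path)
```

After write_then_swap("main", "backup", "new"), main holds the new contents and backup holds the previous main. A crash between the open and the swap leaves backup partially written, which is the confusing case mentioned above.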
