Atomically (not automatically) Moving Files To A Network Drive With Python
It uses the virtually instant
os.rename(source_path, destination_path) where possible with a secondary process to deal with files crossing file systems.
The two changes I made were:
1] I converted it from
shutil.copyfile(source_path, destination_path) to
shutil.copy2(source_path, destination_path) since the later preserves more metadata (e.g. time stamps).
2] I added a check to see if the destination file exists on the external file system before making the move. This prevents the tmp file from being written if it wouldn’t be able to be moved into place. The script will error out faster as a result too.
I’m renaming a bunch of files and moving them to my NAS. If I wasn’t renaming them, I’d use rsync and be done with it . Instead, I’m using a Python script to handle the migration.
I use atomic actions when moving files. They either end up at the destination, or they don’t. There’s no in-between state where a file is only partially written.
Atomic moves are easy with Python’s built-in
os.rename(source_path, destination_path) function. It’s atomic out of the box. Unfortunately, it doesn’t work if you’re moving across file systems or network drives. Python’s
shutil.move() works across systems, but it’s not atomic. In fact, there’s no single built-in command that offers atomic moves across file systems. You have to make your own method.
The way to pull off an atomic move when .rename() isn’t an option is a three step process:
1] Copy the original file to a temporary location on the same file system as the final destination (this is a non-atomic operation).
2] Use the atomic os.rename() function to move the file to it’s final destination. (Just to highlight it, the reason the .rename() method works atomically now when it didn’t before is because the temporary file is on the same file system as the destination path whereas the original copy was not.)
3] Delete the original file.
The way I’ve always done this:
It had been a while since I looked to see if there was a better approach so I spent some time searching. That’s when I discovered Alex’s page. They did something wonderfully clever that never occurred to me. Use os.rename() when you can and then fall back to the tmp copy approach when you can’t. My code works fine in all circumstances, but their’s will run loads faster when operating on the local drive.
The other thing they added was the UUID naming for tmp files to avoid collisions. That’s not an issue for my processes, but I dig the approach.
So, this is the new file mover snippet in my grimoire.
- Atomic, cross-filesystem moves in Python – alexwlchan
- 8. Errors and Exceptions — Python 3.9.1 documentation
- How do I copy a file in Python? - Stack Overflow
- How to Copy a File in Python
- os — Miscellaneous operating system interfaces — Python 3.9.1 documentation
- pathlib — Object-oriented filesystem paths — Python 3.9.1 documentation
- python - A safe, atomic file-copy operation - Stack Overflow
- python - How do I check whether a file exists without exceptions? - Stack Overflow
- python - How to move a file? - Stack Overflow
- python - specifically handle file exists exception - Stack Overflow
- python copy file with meta data, shutil vs manually opening and writing a…
- Python os.rename() Method - Tutorialspoint
- Python Try Except
- Python | shutil.copy2() method - GeeksforGeeks
- re - Python os.rename, remove and error catching - Stack Overflow
- Searching for equivalent of FileNotFoundError in Python 2 - Stack Overflow
- shutil — High-level file operations — Python 3.9.1 documentation
- When should I use shutil.copyfile vs. shutil.copy in Python? - Quora