I'm trying to find a reliable way to detect whether a file has changed, so I can avoid recomputing its SHA-256 hash. I'm manually caching the hashes for performance, and I want to reuse them safely when files haven't changed.
Here’s my setup and current thinking:
Based on my research, the most reliable way to determine if two files are identical is by comparing their SHA-256 hashes.
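For context, this is roughly how I'm computing the hashes (a minimal Python sketch; the helper name and chunk size are just illustrative):

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so large files aren't read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```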
I'm scanning all files in a source directory and comparing them to files in a target directory, based on their content.
To improve performance, I want to cache the SHA-256 hashes of files in the source directory and skip re-hashing files that haven't changed on future scans.
However, I understand that checking only the last modified time (even combined with name or size) may not be reliable: a file's content can change without its mtime changing (e.g., if the timestamp is reset manually with a utility, or via other OS-level tricks). A concrete sketch of the metadata-gated cache I have in mind follows below.
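To make the question concrete, here's the kind of cache I'm considering (a sketch assuming a simple in-memory dict and reusing the `sha256_of_file` helper above; in practice the cache would be persisted between scans):

```python
import os

# Maps path -> (size, mtime_ns, sha256). Illustrative only.
_cache = {}

def cached_sha256(path):
    st = os.stat(path)
    entry = _cache.get(path)
    if entry is not None and entry[0] == st.st_size and entry[1] == st.st_mtime_ns:
        # Size and mtime unchanged: trust the cached hash.
        # This is exactly the step I'm unsure is safe.
        return entry[2]
    digest = sha256_of_file(path)  # fall back to a full re-hash
    _cache[path] = (st.st_size, st.st_mtime_ns, digest)
    return digest
```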
My question is:
Is there a way to detect whether a file's contents have changed that is faster than recomputing its hash, so I can safely decide whether to trust a previously cached SHA-256 hash?
I'm looking for cross-platform solutions (Windows, macOS, and Linux), but OS-specific techniques are welcome too.
Thanks in advance!