
How can I detect "sparse" download-on-demand files on Windows with Python? - Stack Overflow


I have files on disk that Dropbox created as download-on-demand files. I uninstalled Dropbox, but the files were left behind. I can still see them in the folder structure, and File Explorer shows their sizes and other attributes.

When I check a file's properties, its size on disk is 0 bytes, i.e. the content is not actually stored on the drive at all.

Is there a way in Python to detect these files so that I can delete them all?

All the Python functions I know of return only the file size shown in the attributes, not the real size on disk.

Added 31MAR2025 17:49: As mentioned above, Python functions like os.path.getsize() or os.stat() give me a size of 895420 bytes, but I need to get 0, the size on disk.

Many thanks

Vladimir

Asked Mar 31 at 11:35 by Vladimir Buzalka; edited Mar 31 at 21:24 by Charles Duffy.
  • 1 The solution to your question consists of three steps, all of which are trivial and have been answered before: 1) recurse over all files, 2) detect if file is empty, 3) delete file. Please edit and point out where exactly you're struggling. – Friedrich Commented Mar 31 at 11:57
  • 1 Updated my question: with the known Python functions I get the size you can see, 895420 bytes, but I need to get zero, the size on disk. – Vladimir Buzalka Commented Mar 31 at 15:51
  • @Friedrich, ...does the question as edited distinguish itself more clearly? – Charles Duffy Commented Mar 31 at 21:25
  • @CharlesDuffy I think so. OP's comment and previous edit was also helpful. – Friedrich Commented Mar 31 at 21:34

3 Answers

I was playing around, and it seems these are special files (sparse files? I am not sure what those are). All of my files like the one in the question return True with the function below:

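def is_sparse_file_win(path):
    try:
        import ctypes
        from ctypes import wintypes

        FILE_ATTRIBUTE_SPARSE_FILE = 0x00000200
        INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF  # returned when the call fails
        GetFileAttributes = ctypes.windll.kernel32.GetFileAttributesW
        GetFileAttributes.argtypes = [wintypes.LPCWSTR]
        GetFileAttributes.restype = wintypes.DWORD

        attrs = GetFileAttributes(path)
        # A failed call has every bit set, so it must not count as sparse.
        if attrs == INVALID_FILE_ATTRIBUTES:
            return False
        return (attrs & FILE_ATTRIBUTE_SPARSE_FILE) != 0
    except Exception:
        # Not on Windows, or the path could not be passed to the API.
        return False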

On Windows these are probably offline files. Files mirrored from a Dropbox/OneDrive account live in the cloud and occupy zero bytes on disk until they are accessed at least once. GetFileAttributesW can detect them and distinguish a genuinely zero-sized file from one that exists only in the cloud. GetCompressedFileSizeW returns the actual size on disk.

My Windows desktop is backed up by OneDrive, and the code below printed only files that had FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_OFFLINE set and a compressed size of zero.

import os
import ctypes as ct
import ctypes.wintypes as w

NO_ERROR = 0

INVALID_FILE_SIZE = w.DWORD(0xFFFFFFFF).value
INVALID_FILE_ATTRIBUTES = w.DWORD(-1).value

FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

# Helpers to raise exceptions on API failures.

def filesizecheck(result, func, args):
    if result == INVALID_FILE_SIZE and (err := ct.get_last_error()) != NO_ERROR:
        raise ct.WinError(err)
    return result

def attributecheck(result, func, args):
    if result == INVALID_FILE_ATTRIBUTES:
        raise ct.WinError(ct.get_last_error())
    return result

kernel32 = ct.WinDLL('kernel32', use_last_error=True)
GetCompressedFileSize = kernel32.GetCompressedFileSizeW
GetCompressedFileSize.argtypes = w.LPCWSTR, w.LPDWORD
GetCompressedFileSize.restype = w.DWORD
GetCompressedFileSize.errcheck = filesizecheck
GetFileAttributes = kernel32.GetFileAttributesW
GetFileAttributes.argtypes = w.LPCWSTR,
GetFileAttributes.restype = w.DWORD
GetFileAttributes.errcheck = attributecheck

PATH = 'OneDrive/Desktop'
for file in os.listdir(PATH):
    fullpath = os.path.join(PATH, file)
    local_size = GetCompressedFileSize(fullpath, None)
    attributes = GetFileAttributes(fullpath)
    if attributes & (FILE_ATTRIBUTE_OFFLINE |
                     FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS) and local_size == 0:
        print(f'0x{attributes:08X} {local_size:6d} {fullpath}')
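
The loop above passes None for the second argument of GetCompressedFileSizeW, so only the low-order 32 bits of the size come back. That is enough to spot a zero, but if the full on-disk size is wanted, the high-order half can be requested too. The following is a minimal, untested sketch of that idea; the names combine_dwords and size_on_disk are made up for illustration and are not part of any library.

```python
import sys


def combine_dwords(low, high):
    """Join the low and high 32-bit halves into one 64-bit size."""
    return (high << 32) | low


def size_on_disk(path):
    """Return the on-disk size of path in bytes (Windows only).

    Hypothetical helper: wraps GetCompressedFileSizeW and asks for the
    high-order DWORD as well as the low-order one.
    """
    if sys.platform != 'win32':
        raise NotImplementedError('GetCompressedFileSizeW is a Windows API')

    import ctypes as ct
    import ctypes.wintypes as w

    kernel32 = ct.WinDLL('kernel32', use_last_error=True)
    get_size = kernel32.GetCompressedFileSizeW
    get_size.argtypes = w.LPCWSTR, w.LPDWORD
    get_size.restype = w.DWORD

    high = w.DWORD(0)
    ct.set_last_error(0)
    low = get_size(path, ct.byref(high))
    # 0xFFFFFFFF can be a valid low half, so the error code decides.
    if low == 0xFFFFFFFF:
        err = ct.get_last_error()
        if err != 0:
            raise ct.WinError(err)
    return combine_dwords(low, high.value)
```

For a cloud-only file like the one in the question, the expectation is that size_on_disk(path) returns 0 while os.path.getsize(path) still reports the logical size.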

Note that the attrib command in the Windows cmd.exe shell lists file attributes for files in the current directory; the O attribute indicates an offline file.
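
Since the goal in the question is to delete all such files, the same attribute test can be applied to a whole directory tree. Below is a minimal sketch that assumes the attribute bits reported in os.stat()'s st_file_attributes field (available on Windows only) match what GetFileAttributesW returns. The function names are hypothetical, the RECALL_ON_DATA_ACCESS value is copied from the code above, and none of this has been tried against leftover Dropbox files. It checks attributes only, not the size on disk, and deletion is a dry run unless explicitly enabled.

```python
import os
import stat

# Value copied from the answer above; defined here explicitly.
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000
PLACEHOLDER_MASK = (stat.FILE_ATTRIBUTE_OFFLINE |
                    FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS)


def looks_like_placeholder(attributes):
    """Return True if either cloud-placeholder attribute bit is set."""
    return bool(attributes & PLACEHOLDER_MASK)


def find_placeholders(root):
    """Yield paths under root whose attributes look like placeholders."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable entry: skip rather than guess
            # st_file_attributes exists only on Windows; use 0 elsewhere.
            if looks_like_placeholder(getattr(st, 'st_file_attributes', 0)):
                yield path


def delete_files(paths, dry_run=True):
    """Delete each path, or only print it while dry_run is True."""
    handled = []
    for path in paths:
        if dry_run:
            print('would delete:', path)
        else:
            os.remove(path)
        handled.append(path)
    return handled
```

Calling delete_files(find_placeholders('c:/users/me/dropbox/')) only prints the candidates; passing dry_run=False removes them, so it is worth reviewing the printed list first.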

I would do something like this (it also works on older Python versions):

import os

def list_files_recursive(path='.'):
    for entry in os.listdir(path):
        full_path = os.path.join(path, entry)
        if os.path.isdir(full_path):
            list_files_recursive(full_path)
        else:
            # Report files smaller than 10 KB
            if os.path.getsize(full_path) < (10 * 1024):
                print(full_path)

# Specify the directory path you want to search
directory_path = 'c:/users/me/dropbox/'
list_files_recursive(directory_path)

Sorry, I couldn't test the code, so it may not work as-is, but I would recursively check the Dropbox folder for any files with a very small size (under 10 KB).
