I use a remote Git repository for version control of a Matlab/Simulink project.
Over time, this repository grew to a few hundred MB because every commit that touches Simulink files (*.slx, *.sldd, ...) stores them as binary files.
A shallow clone at depth 1 stays below 50 MB, but it does not let me push new commits back to the remote repository if others have pushed modifications in the meantime.
To reduce the repository size, I would like to automatically delete some older binary files according to the following rules:
- Keep all current files
- Keep commit history after a given date or a given commit ID
- Delete the binary files committed before this date, or replace them with empty files, unless they are still in use in HEAD
I tried to use git filter-repo:
git clone --mirror <url> mymirror
cd mymirror
git filter-repo --path-glob "*.slx" --invert-paths --refs HEAD~50 --refs mybranch
The idea was to keep the last 50 commits before HEAD and to remove the .slx files from everything older.
I am not fully happy with this idea, because it might also remove old files that are still used in HEAD but were checked in before the last 50 commits.
The unexpected result was that the whole branch mybranch disappeared from the mirrored repository, and that I still find all .slx files in the commits older than the last 50 in all other branches. It looks like --invert-paths acts globally.
How can I achieve this repository clean-up with git filter-repo or any other Git solution?
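For the rule about files that are no longer used in HEAD, a rough way to see which blobs would be candidates is to compare every blob in history against the blobs in the tree of HEAD. The sketch below builds a throwaway repository (the file names model.slx and lib.slx and the three commits are made up for illustration) and uses only core Git commands:

```shell
# Scratch repository with made-up files, only for illustration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=main init -q .
git config user.email demo@example.com
git config user.name Demo
echo "shared library" > lib.slx          # never changes, so it stays in HEAD
for i in 1 2 3; do
  echo "model version $i" > model.slx    # rewritten in every commit
  git add . && git commit -q -m "commit $i"
done

# Blob IDs referenced by the tree of HEAD.
head_blobs="$repo/.git/head-blobs.txt"
git ls-tree -r HEAD | awk '{print $3}' | sort -u > "$head_blobs"

# Every blob reachable from any ref, minus those still in HEAD.
# Output columns: blob ID, size in bytes, path.
stale=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk 'NR==FNR {keep[$1]; next} $1=="blob" && !($2 in keep) {print $2, $3, $4}' "$head_blobs" -)
printf '%s\n' "$stale"
stale_count=$(printf '%s\n' "$stale" | grep -c .)
echo "blobs no longer in HEAD: $stale_count"
```

In a real repository the scratch setup would be dropped and only the last two pipelines run; sorting the output by the size column shows where the space goes.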
2 Answers
Both git filter-branch and git filter-repo require a branch. So to rewrite the old commits, we create a branch pointing at @~50, filter that branch, and then re-parent the newer commits @~50..@ on top of the filtered branch. Rebasing would be impractical because of too many conflicts; many thanks to @jthill for the helpful guidance about git replace. These commands work for me (I tested with a different wildcard, *.txt):
git branch WIP @~50
git filter-repo --path-glob "*.slx" --invert-paths --refs WIP
git replace @~50 WIP
git filter-repo --replace-refs update-no-add
git branch -D WIP
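To see what the git replace step does before running it on a real repository, the idea can be tried in a throwaway repository with core Git only. In this sketch a hand-built commit without model.slx stands in for the filtered WIP branch (git filter-repo is not called here), and all file names and commit counts are made up:

```shell
# Scratch repository with four commits; names are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=main init -q .
git config user.email demo@example.com
git config user.name Demo
for i in 1 2 3 4; do
  echo "model version $i" > model.slx
  echo "note $i" > notes.txt
  git add . && git commit -q -m "commit $i"
done

old=$(git rev-parse HEAD~2)              # the cut point, like @~50 in the answer

# Build a stripped copy of $old in a temporary index: same tree minus model.slx.
tmp_index="$repo/.git/tmp-index"
GIT_INDEX_FILE="$tmp_index" git read-tree "$old"
GIT_INDEX_FILE="$tmp_index" git update-index --force-remove model.slx
tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
new=$(git commit-tree "$tree" -m "stripped history root")   # no parents
rm -f "$tmp_index"

# From now on Git shows $new wherever $old appears in history.
git replace "$old" "$new"

commits=$(git rev-list --count HEAD)
old_files=$(git ls-tree -r --name-only HEAD~2)
head_files=$(git ls-tree -r --name-only HEAD)
echo "commits visible from HEAD: $commits"
echo "files at the cut point: $old_files"
```

The replacement only lives under refs/replace/ and does not shrink anything by itself; the git filter-repo --replace-refs command in the answer is what is meant to bake it into real commits.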
Probably the simplest way to keep local clones small when dealing with (effectively) frequently changing media files is to filter out all the large blobs initially and let Git fetch them on demand:
git clone --filter=blob:limit=256k u://r/l
That gets you all the history stored in files smaller than 256 KiB and leaves the rest out. Git will go back to the origin repo for anything it discovers it needs, for a checkout or otherwise.
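The effect of the blob filter can be checked in a scratch setup. The sketch below assumes a local throwaway "origin" with two versions of a made-up 4 KiB model.slx and a limit of 1k instead of 256k, so that the small demo file is filtered out; it then counts the objects the clone knows about but does not hold:

```shell
# Scratch "origin" with two versions of a 4 KiB file; names are illustrative only.
origin=$(mktemp -d)
git -c init.defaultBranch=main init -q "$origin"
git -C "$origin" config user.email demo@example.com
git -C "$origin" config user.name Demo
for i in 1 2; do
  yes "model version $i" | head -c 4096 > "$origin/model.slx"
  echo "note $i" > "$origin/notes.txt"
  git -C "$origin" add . && git -C "$origin" commit -q -m "commit $i"
done
# The serving side must allow object filters for a partial clone to work.
git -C "$origin" config uploadpack.allowFilter true

# Partial clone that skips blobs of 1 KiB or more; --no-checkout avoids
# fetching the current model.slx right away.
clone=$(mktemp -d)
git clone -q --no-checkout --filter=blob:limit=1k "file://$origin" "$clone"

# Objects that history refers to but the clone does not hold are printed with '?'.
missing=$(git -C "$clone" rev-list --objects --all --missing=print | grep -c '^?')
echo "blobs left on the origin: $missing"
```

Without --no-checkout, the clone would fetch the current version of the large file during checkout and leave only the older versions on the origin.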
git clone --depth 1 file://$PWD `mktemp -d` && cd $_ && echo >file && git add file && git commit -m- file && git push origin HEAD:refs/heads/kilroy – jthill Commented 2 days ago
git branch WIP @~50; git filter-branch --etc -- WIP – jthill Commented 2 days ago
git fetch --unshallow before push on my side, which is equivalent to full clone – Waldi Commented 2 days ago
git replace @~50 WIP && git filter-branch -- WIP~..@ – jthill Commented 2 days ago
git replace --graft @~49 && git filter-branch @ to do it all at once, what was I thinking. – jthill Commented 2 days ago