Git Smudge and Clean Filters: Making Changes So You Don’t Have To
Oops… I Didn’t Mean To Commit That
Sometimes software development requires us to make local changes to files to perform our daily work, but those changes must not be committed back to the source code repository. And in our day-to-day routine, we do the thing that must not be done—we commit a change we didn’t mean to commit. Look, stuff happens, but then we’re incanting esoteric git commands to revert state, and while recoverable, the flow breakage is unwelcome. If you work in such a situation, git smudge and clean filters may be your solution.
When Push Came to Smudge
I worked on a project where we lacked direct deployment access, which meant build identifiers within the code repository had to remain stable for deployment. Unfortunately, we couldn’t do our daily work with those identifiers and had to maintain constant local changes to some files. Workable, but one of those irritations that add up – elimination would remove friction.
How to Smudge
At the root of the repository is a .gitattributes file. It might look like this:
project.pbxproj filter=munge-project-identifier
The .gitattributes file affects everyone who clones the repository since it’s committed to the repository. However, the filter definition (what munge-project-identifier means) is not. If someone’s git config does not define that filter, this attribute won’t do anything. This means that everyone gets the .gitattributes, but actually applying the filter is opt-in. In my case, the build-to-deployment environment didn’t want these changes, just us developers, so we had to help all developers apply the filter. That is one downside: it’s totally quiet, so failures aren’t readily surfaced.
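Because a missing filter definition fails quietly, it helps to have a quick check. The following is a minimal sketch, run in a throwaway repository created with mktemp purely for illustration; in a real clone you would run only the two git commands in the middle.

```shell
#!/bin/sh
# Sketch: check which filter .gitattributes assigns to a file, and whether
# this clone actually defines that filter. The throwaway repository is an
# assumption made so the example is self-contained.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'project.pbxproj filter=munge-project-identifier\n' > .gitattributes

# Ask git which filter attribute applies to the file.
attr=$(git check-attr filter -- project.pbxproj)
echo "$attr"

# Ask git whether a smudge command is configured for that filter.
if git config --get filter.munge-project-identifier.smudge >/dev/null; then
  status="filter configured"
else
  status="filter NOT configured; the attribute is silently ignored"
fi
echo "$status"
```

In a fresh clone the second check reports that the filter is not configured, which is exactly the quiet opt-in behavior described above.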
Use Scripts
While it’s permitted to define the filter inline, that’s useful for only the simplest of filters. Furthermore, if multiple developers accessing the codebase all need to apply the filter, it should be easy for everyone to adopt without error. So we use scripts.
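Let’s say I had to change an identifier from com.blah.user-thing to com.blah.user-bnr. Git runs the smudge command when it writes a file into the working copy and the clean command when the file is staged, so I would create a script for each direction in a scripts/ folder committed to the repository. Each script reads the file’s content on standard input, writes the transformed content to standard output, and must be executable (chmod +x):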
scripts/git-filter-smudge-project-identifier.sh
#!/bin/sh
sed -e 's/com\.blah\.user-thing/com.blah.user-bnr/g'
scripts/git-filter-clean-project-identifier.sh
#!/bin/sh
sed -e 's/com\.blah\.user-bnr/com.blah.user-thing/g'
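The pair only works if clean exactly undoes smudge; otherwise git will see the file as modified. Here is a small sketch that checks the round trip outside of git. The smudge and clean shell functions stand in for the two scripts, and the sample line is a hypothetical stand-in for a line in project.pbxproj.

```shell
#!/bin/sh
# Sketch: verify that clean(smudge(x)) == x for a sample line.
# The functions mirror the two filter scripts; the sample line is made up.
set -eu

smudge() { sed -e 's/com\.blah\.user-thing/com.blah.user-bnr/g'; }
clean()  { sed -e 's/com\.blah\.user-bnr/com.blah.user-thing/g'; }

committed='PRODUCT_BUNDLE_IDENTIFIER = com.blah.user-thing;'

# What the working copy would contain after checkout.
working=$(printf '%s\n' "$committed" | smudge)
# What git would stage from that working copy.
restored=$(printf '%s\n' "$working" | clean)

echo "working:  $working"
echo "restored: $restored"
```

If restored ever differs from committed, the filter pair is lossy and git status will not stay clean.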
Edit Your Local Git Config
Using a text editor, edit the $(PROJECTDIR)/.git/config file to add the smudge and clean filters:
[filter "munge-project-identifier"]
smudge = /Users/hsoi/Documents/BNR/Development/Projects/Fred/code/scripts/git-filter-smudge-project-identifier.sh
clean = /Users/hsoi/Documents/BNR/Development/Projects/Fred/code/scripts/git-filter-clean-project-identifier.sh
Or using git directly:
$ git config --local filter.munge-project-identifier.smudge /Users/hsoi/Documents/BNR/Development/Projects/Fred/code/scripts/git-filter-smudge-project-identifier.sh
$ git config --local filter.munge-project-identifier.clean /Users/hsoi/Documents/BNR/Development/Projects/Fred/code/scripts/git-filter-clean-project-identifier.sh
The absolute paths are intentional. Relative paths in the .git/config file can be made to work, but they take more effort, and because these settings are local to each developer, an absolute path is sufficient.
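To spare each developer from typing their own absolute path, the path can be derived from the clone itself. This is a sketch under a few assumptions: it runs in a throwaway repository so it is self-contained, the scripts live in scripts/ as described above, and the clone’s path contains no spaces (git passes the command to a shell, so a path with spaces would need extra quoting).

```shell
#!/bin/sh
# Sketch: register the filter using an absolute path computed by git.
# The throwaway repository is only here to make the example runnable;
# in a real clone, start from the "root=" line.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Absolute path of the top of the working copy.
root=$(git rev-parse --show-toplevel)

git config --local filter.munge-project-identifier.smudge \
  "$root/scripts/git-filter-smudge-project-identifier.sh"
git config --local filter.munge-project-identifier.clean \
  "$root/scripts/git-filter-clean-project-identifier.sh"

# Read the values back to confirm what was stored.
smudge_cmd=$(git config --local --get filter.munge-project-identifier.smudge)
clean_cmd=$(git config --local --get filter.munge-project-identifier.clean)
echo "$smudge_cmd"
echo "$clean_cmd"
```

A script like this could itself be committed next to the filter scripts so that opting in is a single command.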
It’s Magic ✨
Once all of this is in place:
- git status will show your working copy is clean.
- Examination of the local/working copy of the file will show the changes: I see com.blah.user-bnr in my file.
- Examination of the remote/committed copy of the file will show the unchanged original: I see com.blah.user-thing in my remote file.
- Whenever you work, you should never notice the changes in the file, other than everything works and friction has been removed! They should never be committed, and life should be magical.
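The three observations above can be reproduced end to end. The following sketch uses a throwaway repository, a made-up one-line project.pbxproj, and an inline filter definition for brevity instead of the script files; all of those are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: show that the working copy keeps the local identifier while the
# committed copy keeps the original one, and that git status stays clean.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Dev"

# Inline filter definitions, equivalent to the two scripts.
git config filter.munge-project-identifier.smudge \
  "sed -e 's/com\.blah\.user-thing/com.blah.user-bnr/g'"
git config filter.munge-project-identifier.clean \
  "sed -e 's/com\.blah\.user-bnr/com.blah.user-thing/g'"

printf 'project.pbxproj filter=munge-project-identifier\n' > .gitattributes
# The working copy carries the local identifier.
printf 'PRODUCT_BUNDLE_IDENTIFIER = com.blah.user-bnr;\n' > project.pbxproj

git add .gitattributes project.pbxproj   # the clean filter runs here
git commit -q -m "Add project file"

working=$(cat project.pbxproj)               # local copy
committed=$(git show HEAD:project.pbxproj)   # stored copy, unfiltered
status=$(git status --porcelain)             # empty when clean

echo "working:   $working"
echo "committed: $committed"
echo "status:    ${status:-clean}"
```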
Troubleshooting
It’s possible that despite the above changes the “old” data still shows and the filter is not applied. Here are a couple of things I’ve tried:
First, double-check that all steps, names, and paths are correct.
Second, try deleting and restoring the file(s) affected by the filter. I would delete the file directly (e.g. go into the Finder and Trash the file), then use git to restore it (e.g. git checkout -- <file>, or git reset --hard if you have no other local changes to keep). Git applies the smudge filter whenever it writes a file into the working copy, so this forces the filter to run.
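The delete-and-restore step can be done from the command line too. Below is a sketch of the situation where the file was checked out before the filter was configured, so the working copy still shows the old value. It uses a throwaway repository, a made-up one-line file, and inline filter definitions, all assumed for illustration; git checkout -- <file> is one way to restore the file.

```shell
#!/bin/sh
# Sketch: a file checked out before the filter existed stays unsmudged
# until git writes it to the working copy again.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Dev"

printf 'project.pbxproj filter=munge-project-identifier\n' > .gitattributes
printf 'PRODUCT_BUNDLE_IDENTIFIER = com.blah.user-thing;\n' > project.pbxproj
git add .
git commit -q -m "Add project file"

# The filter is defined only now, after the file is already on disk.
git config filter.munge-project-identifier.smudge \
  "sed -e 's/com\.blah\.user-thing/com.blah.user-bnr/g'"
git config filter.munge-project-identifier.clean \
  "sed -e 's/com\.blah\.user-bnr/com.blah.user-thing/g'"

before=$(cat project.pbxproj)    # still the committed identifier

rm project.pbxproj               # delete the stale working copy
git checkout -- project.pbxproj  # git rewrites it and runs smudge

after=$(cat project.pbxproj)
echo "before: $before"
echo "after:  $after"
```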
If there are still problems, or you want to learn more nitty-gritty about git attribute keyword expansion support (what “git smudge and clean” is all about), you can check the official documentation: “Customizing Git Attributes: Keyword Expansion”.
Smudge Away!
Git smudge and clean filters are a little nugget hidden away in the corner of git esoterica. But once you know about and use them, the friction they remove helps your day run smoother. It’s these sorts of efficiencies that we tend to build into all the work we do. If you’d like to learn more about our process, schedule a chat with one of our friendly Nerds!
Image credit: https://git-scm.com/downloads/logos