Find

Find is a great tool, but because it is so powerful and has so many flags, it can be infuriatingly hard to remember exactly how to do things.

Here is a quick rundown of common tasks and best practices.

Always Use Absolute Path

Especially if you are on a server, you can avoid misery by always using an absolute path to search in.

If you are ever going to do anything destructive with find, such as deleting files, then using . (the current directory) leaves a trap waiting in your command history. Rerun that command from a different directory and it will act on whatever happens to be there. This can, and does, cause accidental server destruction.

You can completely mitigate this risk by using an absolute path. In the worst case, a command rerun from history acts on the directory it was originally intended for.

DANGEROUS!

find . -type f -exec rm -rf {} \;

SAFE :)

find /abs/path/to/folder/with/things/to/delete -type f -exec rm -rf {} \;
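To see why the relative form is a trap, the following sketch runs the same search from two different working directories inside a throwaway sandbox. The sandbox layout and variable names are invented for illustration, and -print stands in for the deletion so nothing of value is touched.

```shell
#!/usr/bin/env bash
# Throwaway sandbox; every name below is illustrative only.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/logs" "$sandbox/important"
touch "$sandbox/logs/old.log" "$sandbox/important/data.db"

# Relative path: the result depends on where the shell happens to be.
cd "$sandbox/logs"
relative_in_logs=$(find . -type f -print)
cd "$sandbox/important"
relative_in_important=$(find . -type f -print)   # same command, different files

# Absolute path: same result regardless of the working directory.
absolute_from_elsewhere=$(find "$sandbox/logs" -type f -print)

echo "relative, run in logs:      $relative_in_logs"
echo "relative, run in important: $relative_in_important"
echo "absolute, run anywhere:     $absolute_from_elsewhere"

# Clean up the sandbox (absolute path, of course).
cd / && rm -rf -- "$sandbox"
```

The relative command matches a different file in each directory, while the absolute one keeps pointing at the logs folder no matter where it is rerun.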

Deleting Files

A common use case of find is to find and remove files.

For example, in a folder of logs or backups, you might want to delete files that are older than a certain number of days.

To do this, you should, as described above, use the absolute path to the folder. You should also use the most specific file name mask you can. Using * is the most dangerous option. Something like *.gz is much better, because it only matches the .gz files that you are planning to delete.

When crafting your deletion command, always run it without the deletion first, just to be absolutely sure what you are going to delete.
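A minimal sketch of that workflow, using a throwaway directory in place of a real log folder (the target variable and the file names are made up for the example). The same expression is first run with -print, and only then with -delete, which GNU and BSD find both offer as an alternative to -exec rm.

```shell
#!/usr/bin/env bash
# Throwaway directory standing in for a real log folder (hypothetical names).
target=$(mktemp -d)
touch "$target/app.log" "$target/app.log.1.gz" "$target/app.log.2.gz" "$target/notes.txt"

# Step 1: dry run. Quote the mask so the shell does not expand it before find sees it.
preview=$(find "$target" -type f -name '*.gz' -print | sort)
echo "Would delete:"
echo "$preview"

# Step 2: once the preview looks right, rerun the identical expression with -delete.
find "$target" -type f -name '*.gz' -delete

remaining=$(find "$target" -type f -print | sort)
echo "Left behind:"
echo "$remaining"
```

Keeping the two commands identical apart from the final action means the preview is an honest picture of what the deletion will touch.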

Deleting Files Older Than X Days

Find Command:

numDays=7
find /abs/path/to/folder -type f -mtime +${numDays}

Find and Delete Command:

numDays=7
find /abs/path/to/folder -type f -mtime +${numDays} -exec rm -rf {} \;
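One detail worth checking before trusting the cutoff: GNU find counts age in whole 24-hour periods and ignores the fraction, so -mtime +7 matches files that are at least 8 days old, not files that are 7 and a bit days old. The sketch below shows this in a temporary directory. It assumes GNU find and GNU touch (for the -d '... ago' syntax), and the file names are hypothetical.

```shell
#!/usr/bin/env bash
numDays=7
agedir=$(mktemp -d)

# Create files with back-dated modification times (GNU touch syntax).
touch -d '180 hours ago' "$agedir/seven-and-a-half-days.gz"   # 7.5 days old
touch -d '9 days ago' "$agedir/nine-days.gz"
touch "$agedir/fresh.gz"

# Only files whose age, in whole days, is strictly greater than numDays match.
matched=$(find "$agedir" -type f -name '*.gz' -mtime +"${numDays}" -print)
echo "$matched"
```

If the file that is 7.5 days old should be caught as well, -mtime +6 would be the expression to preview.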

Finding Big Files or Folders

If you want to search a directory for big files, you can have a read of this article

The best recommendation for doing this is:

First, Find Big Files and Save to List

find /abs/path/to/folder -type f -size +10M -exec du -Sh {} + | sort -rh > /tmp/filesList.txt

You now have a text file you can look at and process further as you see fit, without having to repeat a potentially expensive find command

Second, Filter the List

Now you have a big list, but it no doubt includes things that you do not want to delete.

You can repeatedly run sed over this list to filter things out like this:

cd /tmp
cp filesList.txt filesListFiltered.txt
# Filter by path:
sed -i '/\/abs\/path\/to\/folder\/containing\/things\/we\/should\/not\/delete/d' filesListFiltered.txt
# Filter by extension:
sed -i '/\.ext/d' filesListFiltered.txt
# Remove git paths:
sed -i '/\.git\//d' filesListFiltered.txt
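Before pointing filters like these at the real list, it can help to try them on a few sample lines. The sketch below uses an invented three-line list (the /srv/app paths are not from any real system) and writes a filtered copy rather than editing in place. It also anchors the extension pattern with $, since an unanchored pattern such as \.mp4 would match anywhere in the line, including directory names.

```shell
#!/usr/bin/env bash
# Hypothetical sample of what the size-sorted list might contain.
list=$(mktemp)
filtered=$(mktemp)
printf '%s\t%s\n' \
    '120M' '/srv/app/.git/objects/pack/pack-1.pack' \
    '80M'  '/srv/app/uploads/video.mp4' \
    '40M'  '/srv/app/uploads.mp4.d/cache.bin' \
    > "$list"

# Write a filtered copy instead of editing in place; the original list stays intact.
# The trailing $ ties the extension to the end of the line, so the directory
# named uploads.mp4.d is not removed by accident.
sed -e '/\.git\//d' -e '/\.mp4$/d' "$list" > "$filtered"

kept=$(cat "$filtered")
echo "$kept"
```

Only the cache.bin entry survives: the git pack and the real .mp4 file are dropped, while the directory that merely contains .mp4 in its name is kept.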

Third, Pull out Directories

Now that you have a list of big files, let's get the directories they live in:

grep -Po '/(.+)/' filesListFiltered.txt | sort -u
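To make the pattern less mysterious, here is what it does to a few sample lines. The regex is greedy, so it matches from the first slash on the line to the last one, which drops both the size column and the file name. The sample entries are invented, and -P requires a grep built with PCRE support (GNU grep normally is).

```shell
#!/usr/bin/env bash
# Invented sample lines in the "size<TAB>path" shape that du produces.
sample=$(printf '%s\t%s\n' \
    '120M' '/var/www/vhosts/site-a/logs/access.log' \
    '95M'  '/var/www/vhosts/site-a/logs/error.log' \
    '60M'  '/var/www/vhosts/site-b/backups/db.sql')

# -o prints only the matched part; sort -u collapses duplicate directories.
dirs=$(printf '%s\n' "$sample" | grep -Po '/(.+)/' | sort -u)
echo "$dirs"
```

The two log files collapse into a single /var/www/vhosts/site-a/logs/ entry, followed by /var/www/vhosts/site-b/backups/.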

To look at the biggest files/folders in each of those directories, try:

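# Assumes the list holds absolute paths, so each entry can be entered directly.
for directory in $(grep -Po '/(.+)/' /tmp/filesListFiltered.txt | sort -u);
do
    echo "$directory";
    # Skip anything that cannot be entered rather than running du in the wrong place.
    cd "$directory" || continue;
    du -hs * | sort -rh | head -n 5;
    printf "\n\n\n";
done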

From here, Take Action

At this point, you should start to have a good idea of where you need to look to delete your big files

Warning

Always be very paranoid when doing automated deletions on a live server!