The power to find what you need

As part of what I do, I often end up working on the command line in Linux.  As anyone who works in IT can tell you, log files are very important.  Something they might not mention is that they also take up a lot of space, depending on the level of logging being done.  Recently I needed to free up some space for log files on a server.  I didn’t want to get rid of any files, just compress the older ones.  Chances are I wouldn’t need them, but if I did, they would still be there.  To accomplish this, I decided to use find, a Linux command line tool that, you guessed it, finds “stuff”.  Find is useful in that the command line arguments you pass it can quickly narrow the results down to a very specific set of files.  Find also automatically recurses down through the subdirectories, so you get a comprehensive list; you can change that through a command line option as well.
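The post doesn’t say which option changes the recursion. One candidate is -maxdepth, which GNU and BSD find both support, though it is not part of POSIX. Here is a minimal sketch, assuming GNU find and using a throwaway directory with made-up file names:

```shell
# Build a small throwaway tree (hypothetical names) to compare the two behaviours.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/app/archive"
touch "$sandbox/top.log" "$sandbox/app/app.log" "$sandbox/app/archive/old.log"

# Default: find descends into every subdirectory.
all_files=$(find "$sandbox" -type f | wc -l)

# -maxdepth 1: only look at entries directly inside the starting directory.
top_only=$(find "$sandbox" -maxdepth 1 -type f | wc -l)

echo "recursive: $all_files, top level only: $top_only"
rm -rf "$sandbox"
```

Putting -maxdepth ahead of tests such as -type keeps GNU find from warning about option order.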

So, the first thing I wanted to see was all files (I’m not interested in directories) in the directory designated for logging that were older than 10 days (a fairly arbitrary number of days: a little more than a week, less than two weeks).  That’s easy enough to do; I just entered

find -type f -mtime +10

and instantly got a whole slew of files returned.  So, let’s break down what I did: “-type f” tells it I’m looking only for files, no directories, and “-mtime +10” tells the program to only give me files that have a “last modified” date OLDER (+) than 10 days.  (Find counts age in whole 24-hour periods and drops the fraction, so strictly speaking a file has to be at least 11 days old to match.)  I noticed a few files that have a .gz at the end, which means they’ve already been compressed, so I don’t need them in the list.  Doing a bit of searching, I found that adding “-not” in front of any option returns the opposite.  Knowing that the “-name” option returns files based on wildcard matching, I added that to the mix and ended up with:

find -not -name "*.gz" -type f -mtime +10

That looks better, giving me a shorter list, and NONE of the files listed end in .gz. Now the next step is to do something with those files, and this is where, IMHO, the true power of find comes into play.  You can pass a command line argument to find that tells it to “do something” with the files it finds.  Just to make sure I’m not making any crazy mistakes, the first thing I try is something simple and harmless, like listing the full information for each file:

find . -not -name "*.gz" -type f -mtime +10 -exec ls -alh {} \;

The -exec command line argument is great in that it will “execute” everything after “-exec” up to the “\;”.  The “{}” tells it where to substitute the result, so the command runs once for each file found.  This means you can string along several commands, although at my level, I usually just want to do one thing at a time.  I like seeing each step and, by doing one step at a time, chances are I’ll find mistakes before it’s too late.
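To see how -exec hands results to a command without touching real logs, here is a small sandbox sketch. It assumes GNU touch (for the -d "15 days ago" syntax) and uses made-up file names. It also shows a second terminator, “+”, which the commands in this post don’t use: instead of running the command once per file, find appends as many paths as will fit to a single invocation.

```shell
# Throwaway directory with hypothetical file names; nothing real is touched.
sandbox=$(mktemp -d)
touch -d "15 days ago" "$sandbox/old1.log" "$sandbox/old2.log" "$sandbox/old3.log.gz"
touch "$sandbox/today.log"

# "\;" form: echo runs once per matching file, so each path lands on its own line.
per_file=$(find "$sandbox" -not -name "*.gz" -type f -mtime +10 -exec echo {} \; | wc -l)

# "+" form: find batches the matching paths into one echo call, giving one line.
batched=$(find "$sandbox" -not -name "*.gz" -type f -mtime +10 -exec echo {} + | wc -l)

echo "lines printed one-at-a-time: $per_file, batched: $batched"
rm -rf "$sandbox"
```

Only the two old files without a .gz suffix pass the filter. The “\;” form is the easier one to reason about step by step, while “+” mostly matters when there are many files and starting one process per file gets slow.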

So, this is great, I have a listing of files, but what I really want to do is compress those files.  Now that I have a means of listing the files, and the results are what I’m expecting, I can replace the “command” with what I really want to do.  The final result is:

find . -not -name "*.gz" -type f -mtime +10 -exec gzip {} \;

The sweet part of this is that I can run this command on a daily, weekly, or monthly basis and it won’t attempt to recompress files that have already been compressed.  It’s really a minor thing, but why try to process files that don’t need it?
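For repeated runs, one option is to wrap the command in a small function that a scheduler could call. The sketch below is only an illustration: the function name compress_old_logs, its arguments, and the 10-day default are made up for this example, and the demonstration relies on GNU touch to create an artificially old file in a throwaway directory.

```shell
#!/bin/sh
# compress_old_logs DIR [DAYS]: gzip regular files under DIR that pass the
# -mtime +DAYS test, skipping anything whose name already ends in .gz.
# The name and the default of 10 days are illustrative choices.
compress_old_logs() {
    dir=$1
    days=${2:-10}
    find "$dir" -not -name "*.gz" -type f -mtime +"$days" -exec gzip {} \;
}

# Demonstration in a throwaway directory (touch -d is a GNU extension).
sandbox=$(mktemp -d)
touch -d "15 days ago" "$sandbox/old.log"
touch "$sandbox/recent.log"

compress_old_logs "$sandbox"      # first run compresses old.log
first_run=$(ls "$sandbox" | sort | tr '\n' ' ')

compress_old_logs "$sandbox"      # second run finds nothing left to do
second_run=$(ls "$sandbox" | sort | tr '\n' ' ')

echo "after first run:  $first_run"
echo "after second run: $second_run"
rm -rf "$sandbox"
```

The second run leaves the directory unchanged because the only old file now ends in .gz and is filtered out by name.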

I hope you’ve found this helpful and can build on what I’ve shown.  The man page for find, if you’re so inclined to read it, can be read by running “man find”.



By Mark

I work in IT and ride Motorcycles. I do one to support the other.