The issue
I have a folder containing about 10,000 files. I want to scan through its directories and isolate the files that differ from what is recorded in the database. How do you write the names of all the files into a single text file?
Solution
One option is the ls (list) command
ls > filename-list.txt
However, this only lists the contents of the current directory. Subdirectories appear by name only, so the files inside them are not included.
The second option is the find command
The find command has several key tests and options to be aware of...
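To see the limitation for yourself, here is a small sketch that builds a throwaway directory tree and runs ls on it. The directory and file names (files, reports, a.pdf, b.pdf) are made up for the demo.

```shell
# Build a throwaway tree: one file at the top level, one nested a level down
demo=$(mktemp -d)
mkdir -p "$demo/files/reports"
touch "$demo/files/a.pdf" "$demo/files/reports/b.pdf"

# Run ls from inside the folder; the list is written outside it so the
# list file does not show up in its own listing
(cd "$demo/files" && ls > "$demo/ls-list.txt")

# Shows a.pdf and reports, but not reports/b.pdf
cat "$demo/ls-list.txt"
```

The nested b.pdf never appears, which is why ls alone is not enough for a folder with subdirectories.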
Search by type
To search for a specific file type, pass one of the following letters to the -type test:
- b – block devices
- c – character devices
- d – directory
- f – regular file
- l – symbolic link
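As a quick illustration of -type, the sketch below creates a small temporary tree with made-up names and asks find for directories, regular files, and symbolic links separately.

```shell
# Throwaway tree: two directories, two regular files, one symbolic link
demo=$(mktemp -d)
mkdir -p "$demo/files/sub"
touch "$demo/files/a.txt" "$demo/files/sub/b.txt"
ln -s "$demo/files/a.txt" "$demo/files/link-to-a"

# One query per type letter
dirs=$(find "$demo/files" -type d | sort)      # files and files/sub
regular=$(find "$demo/files" -type f | sort)   # a.txt and sub/b.txt
links=$(find "$demo/files" -type l | sort)     # link-to-a

printf 'Directories:\n%s\n' "$dirs"
printf 'Regular files:\n%s\n' "$regular"
printf 'Symbolic links:\n%s\n' "$links"
```

Note that the starting directory itself counts as a match for -type d.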
Search by name
I don't want to find one specific file name; I want every PDF. The -name test accepts a shell-style wildcard pattern, so I can add the following (quoted so that the shell does not expand the pattern itself)
-name '*.pdf'
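One thing to watch for: -name is case-sensitive, so a file called SCAN.PDF would be missed. A minimal sketch, assuming a find that supports the -iname extension (GNU and BSD find both do, but it is not part of POSIX), with made-up file names:

```shell
# Throwaway folder with a lowercase PDF, an uppercase PDF, and a text file
demo=$(mktemp -d)
touch "$demo/report.pdf" "$demo/SCAN.PDF" "$demo/notes.txt"

# Case-sensitive match: only report.pdf
exact=$(find "$demo" -type f -name '*.pdf')

# Case-insensitive match: report.pdf and SCAN.PDF
anycase=$(find "$demo" -type f -iname '*.pdf' | sort)

printf '%s\n' "$exact"
printf '%s\n' "$anycase"
```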
Search Files by Size
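To find all files smaller than 4MB (GNU find rounds each file's size up to a whole number of units before comparing, so in practice this matches files of 3MB or less)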
-size -4M
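Or, to search a size range, such as all files between 3MB and 8MB, combine a lower bound (+) with an upper bound (-)
-size +3M -size -8M
Use the following suffixes to specify the size unit:
- c – bytes
- b – 512-byte blocks (the default when no suffix is given)
- k – Kilobytes (units of 1024 bytes)
- M – Megabytes (units of 1024 × 1024 bytes)
- G – Gigabytes (units of 1024 × 1024 × 1024 bytes)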
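A size range needs both a + lower bound and a - upper bound. The sketch below checks this on three small test files; it uses the k suffix rather than M only to keep the files tiny, and the file names are made up.

```shell
# Create files of 1 KiB, 5 KiB and 10 KiB filled with zero bytes
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/one.bin"  bs=1024 count=1  2>/dev/null
dd if=/dev/zero of="$demo/five.bin" bs=1024 count=5  2>/dev/null
dd if=/dev/zero of="$demo/ten.bin"  bs=1024 count=10 2>/dev/null

# Larger than 3k AND smaller than 8k: only five.bin qualifies
inrange=$(find "$demo" -type f -size +3k -size -8k)

# Smaller than 8k only: one.bin and five.bin
small=$(find "$demo" -type f -size -8k | sort)

printf '%s\n' "$inrange"
printf '%s\n' "$small"
```

The same pattern should carry over to M and G by swapping the suffix.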
Search Files by Modification Time
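Want to search for files based on their modification, access, or change time? The -mtime test covers modification time, and -atime and -ctime work the same way for access and change time. To find files that were modified within the last n days, put a - in front of the number... for example, in the past 15 days (a bare 15 would match only files modified exactly 15 days ago)
-mtime -15
Inversely, use the + symbol to find all files that were last modified more than n days ago... such as more than 5 days ago
-mtime +5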
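To check which sign does what, here is a rough sketch that back-dates a few test files and queries them. It assumes GNU touch, whose -d option accepts relative dates such as '7 days ago'; BSD and macOS touch do not accept that syntax. The file names are made up.

```shell
# Three files: modified now, 7 days ago, and 20 days ago (GNU touch -d)
demo=$(mktemp -d)
touch "$demo/today.txt"
touch -d '7 days ago'  "$demo/week-old.txt"
touch -d '20 days ago' "$demo/stale.txt"

# Modified within the last 15 days: today.txt and week-old.txt
recent=$(find "$demo" -type f -mtime -15 | sort)

# Modified more than 5 days ago: stale.txt and week-old.txt
older=$(find "$demo" -type f -mtime +5 | sort)

printf 'Last 15 days:\n%s\n' "$recent"
printf 'More than 5 days ago:\n%s\n' "$older"
```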
Search Files by directory depth
Limit how many directory levels find descends below the starting point... such as at most 10 levels
-maxdepth 10
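A small sketch of how the depth count works, using a made-up three-level tree. The starting directory is depth 0, so files directly inside it are at depth 1.

```shell
# Throwaway tree with one file at each of three levels
demo=$(mktemp -d)
mkdir -p "$demo/files/a/b"
touch "$demo/files/top.txt" "$demo/files/a/mid.txt" "$demo/files/a/b/deep.txt"

# Depth 1: only files directly inside the starting directory
shallow=$(find "$demo/files" -maxdepth 1 -type f)

# Depth 2: also includes files one directory further down
two=$(find "$demo/files" -maxdepth 2 -type f | sort)

printf '%s\n' "$shallow"
printf '%s\n' "$two"
```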
Pulling the query together
To begin, I'll use directory depth and file type:
find files -type f -maxdepth 10 > filename-list.txt
However, find printed a warning...
find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it. Please specify global options before other arguments.
Moving -maxdepth ahead of -type did the trick
find files -maxdepth 10 -type f > filename-list.txt
I also want to limit the results to PDF files
find files -maxdepth 10 -type f -name '*.pdf' > filename-list.txt
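With the list in hand, one possible next step towards the original goal is to compare it against what the database knows. This is only a sketch: db-list.txt is a hypothetical export of file paths from the database, one per line, and the tree below is made up for the demo. comm needs both inputs sorted the same way, hence LC_ALL=C on both sides.

```shell
# Throwaway tree standing in for the real files folder
demo=$(mktemp -d)
mkdir -p "$demo/files/2023"
touch "$demo/files/a.pdf" "$demo/files/2023/b.pdf" "$demo/files/notes.txt"

# Hypothetical database export: it only knows about a.pdf
printf '%s\n' "files/a.pdf" > "$demo/db-list.txt"

# Same find command as above, sorted so comm can use the result
(cd "$demo" && find files -maxdepth 10 -type f -name '*.pdf' \
    | LC_ALL=C sort > filename-list.txt)
LC_ALL=C sort "$demo/db-list.txt" > "$demo/db-sorted.txt"

# comm -23 keeps lines that appear only in the first file:
# PDFs on disk that the database export does not mention
missing=$(LC_ALL=C comm -23 "$demo/filename-list.txt" "$demo/db-sorted.txt")
printf '%s\n' "$missing"
```

Swapping -23 for -13 would instead list entries that the export has but the disk does not.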