Find the total size of certain files within a directory branch

The ultimate answer is:

{ find <DIR> -type f -name "*.<EXT>" -printf "%s+"; echo 0; } | bc

An even faster version, not limited by RAM, but requiring GNU AWK with bignum support:

find <DIR> -type f -name "*.<EXT>" -printf "%s\n" | gawk -M '{t+=$1}END{print t}'

This version has the following features:

  • all capabilities of find to specify the files you’re looking for
  • supports millions of files
    • other answers here are limited by the maximum length of the argument list
  • spawns only 3 simple processes with a minimal pipe throughput
    • many answers here spawn C+N processes, where C is some constant and N is the number of files
  • doesn’t bother with string manipulation
    • this version doesn’t do any grepping, or regexing
    • well, find does a simple wildcard matching of filenames
  • optionally formats the sum into a human-readable form (e.g. 5.5K, 176.7M, …)
    • to do that, append | numfmt --to=si to the command