Why does `ls -l` count more files than me?



























Apparently I cannot count. I think there are three files in /media:



$ tree /media
/media
├── foo
├── onex
└── zanna
3 directories, 0 files


However, ls -l finds 12.



$ ls -l /media
total 12
drwxr-xr-x 2 root root 4096 Jul 31 20:57 foo
drwxrwxr-x 2 root root 4096 Jun 26 06:36 onex
drwxr-x---+ 2 root root 4096 Aug 7 21:17 zanna


And if I do ls -la, I get only . and .. in addition to the above, but the count is total 20.



What's the explanation?










      command-line files ls






asked Aug 10 '16 at 14:44 by Zanna, edited Oct 10 '17 at 18:11

2 Answers
































          The 12 you see is not the number of files, but the number of disk blocks consumed.



          From info coreutils 'ls invocation':



           For each directory that is listed, preface the files with a line
          `total BLOCKS', where BLOCKS is the total disk allocation for all
          files in that directory. The block size currently defaults to 1024
          bytes, but this can be overridden (*note Block size::). The
          BLOCKS computed counts each hard link separately; this is arguably
          a deficiency.


The total goes from 12 to 20 when you use ls -la instead of ls -l because two additional directories are counted: . and .. (the directory itself and its parent). You are using four 1024-byte blocks for each (empty) directory, so the total goes from 3 × 4 to 5 × 4. (In all likelihood each directory actually occupies one 4096-byte disk block; as the info page indicates, ls does not check the disk format, but assumes a block size of 1024 unless instructed otherwise.)
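You can check this reading (my addition, assuming GNU ls and one 4096-byte block per directory) by overriding the unit that ls reports in; the totals then match the number of directories counted:

$ ls -l  --block-size=4096 /media | head -n 1
total 3
$ ls -la --block-size=4096 /media | head -n 1
total 5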



If you simply want the number of files, you might try something like



          ls | wc -l





answered Aug 10 '16 at 14:49 by user4556274, edited Aug 10 '16 at 15:17





















ls | wc -l will fail if there are files with a newline in the filename. This is more resilient: find . -mindepth 1 -maxdepth 1 -printf . | wc -c
– Flimm, Aug 10 '16 at 19:18

"if file names have a new line in them" ... shudder
– Petah, Aug 11 '16 at 3:54

As man ls will tell you, you can avoid control chars with -b (escapes them) or -q (replaces them with ?). So for counting, ls -1q | wc -l is safe and accurate for non-hidden files, and ls -1qA | wc -l also counts hidden files (but not . and ..). I'm using -1 instead of -l because that should be faster.
– Oli, Aug 11 '16 at 15:24



































user4556274 has already answered the why. My answer serves only to provide additional information on how to count files properly.



In the Unix community, the general consensus is that parsing the output of ls is a very bad idea, since filenames can contain control or other hidden characters. For example, because of a newline character in one filename, ls | wc -l tells us there are 5 lines in the output of ls (which there are), but in reality there are only 4 files in the directory.



$> touch FILE$'\n'NAME
          $> ls
          file1.txt file2.txt file3.txt FILE?NAME
          $> ls | wc -l
          5


          Method #1: find utility



The find command, which is typically used to work around filename-parsing problems, can help us here by printing inode numbers. Every file or directory has exactly one unique inode number, so by using -printf "%i\n" and excluding . via -not -name "." we get an accurate count of the files. (Note the use of -maxdepth 1 to prevent recursive descent into subdirectories.)



          $> find  -maxdepth 1 -not -name "." -print                                    
          ./file2.txt
          ./file1.txt
          ./FILE?NAME
          ./file3.txt
$> find -maxdepth 1 -not -name "." -printf "%i\n" | wc -l
          4
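If you want to count only regular files and skip directories, find can also filter by type. This variant is my addition rather than part of the original answer; in the example directory all four entries happen to be regular files:

$> find . -maxdepth 1 -type f -printf "%i\n" | wc -l
4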


Method #2: shell glob



          Simple, quick, and mostly portable way:



          $ set -- * 
          $ echo $#
          228


The set command is used to set the positional parameters of the shell (the $<INTEGER> variables, as in echo $1). This is often used to work around the lack of arrays in /bin/sh. A version that performs extra checks can be found in Gilles's answer over on Unix & Linux.
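A minimal sketch of such a guard (my illustration; it assumes that with nullglob unset, an empty directory leaves the literal * as $1):

$ set -- *
$ [ -e "$1" ] || set --   # empty directory: discard the unexpanded *
$ echo $#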



          In shells that support arrays, such as bash, we can use



          items=( dir/* )
          echo ${#items[@]}


          as proposed by steeldriver in the comments.



A trick similar to the find method, combining the shell glob with wc, can be used with stat to print one inode number per line:



$> LC_ALL=C stat ./* --printf "%i\n" | wc -l
          4


An alternative approach is to use a wildcard in a for loop. (Note: this test uses a different directory, to check whether the approach descends into subdirectories; it does not, and 16 is the verified number of items in my ~/bin.)



          $> count=0; for item in ~/bin/* ; do count=$(($count+1)) ; echo $count ; done | tail -n 1                                
          16
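The echo inside the loop and the tail -n 1 are only needed because the pipe runs the loop in a subshell, so $count is lost afterwards; without the pipe, the final value can be printed directly. A simpler equivalent (my simplification of the same idea):

$> count=0; for item in ~/bin/* ; do count=$((count+1)) ; done ; echo $count
16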


          Method #3: other languages/interpreters



Python can also deal with problematic filenames, by printing the length of the list returned by the os.listdir() function (which is non-recursive and lists only the items in the directory given as its argument).



$> python -c "import os ; print os.listdir('.')"
['file2.txt', 'file1.txt', 'FILE\nNAME', 'file3.txt']
          $> python -c "import os ; print(len(os.listdir('.')))"
          4


          See also




          • What's the most resource efficient way to count how many files are in a directory?






answered Aug 10 '16 at 20:51 by Sergiy Kolodyazhnyy, edited Jan 4 at 9:59





















In bash, another option would be to use an array, e.g. items=( dir/* ); echo ${#items[@]} (adding shopt -s dotglob to include hidden files).
– steeldriver, Aug 11 '16 at 4:01

Printing inode numbers makes it easy to filter hardlinks if desired, with find | sort -u | wc -l.
– Peter Cordes, Aug 11 '16 at 4:06

@steeldriver: I think the bash-array method is unlikely to be faster. If you want it to be recursive, you need to use items=( dir/** ) (with shopt -s globstar), but bash doesn't take advantage of extra metadata from readdir, so it stats every directory entry to see whether it is a directory itself. Many filesystems do store the file type in the directory entry, so readdir can return it without accessing the inodes (e.g. the latest non-default XFS has this, and I think ext4 has had it for longer). If you strace find, you'll see a lot fewer stat system calls than when stracing bash.
– Peter Cordes, Aug 11 '16 at 4:11

Why not just use print(len(os.listdir('.')))? Fewer characters to type and also avoids accessing doubly-underscored attributes.
– edwinksl, Aug 11 '16 at 5:05

@edwinksl edited, thx
– Sergiy Kolodyazhnyy, Aug 11 '16 at 5:46












