Find Missing Numbers [on hold]











I have a big, ordered list of video files with names like this:



S1-E18-(Date)-(Title)-(Random numbers).mp4


Here is an example of the list:



S1-E1-20100526-title-of-video-1400316375.mp4
S1-E3-20100547-title-of-video-15457547.mp4
S10-E5-20100463-title-of-video-14467457.mp4


In this case it's easy to see that the files S1-E2 and S10-E4 are missing.
But if I have a big list, how can I find the missing files?
(Ignore the season number, S1, S2, and so on; I only need to check the episode number after E.)

The largest existing file number is S50-E2184, and
the smallest existing file number is S1-E1.










put on hold as off-topic by karel, N0rbert, dessert, Eric Carvalho, George Udosen 2 days ago


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This is not about Ubuntu. Questions about other Linux distributions can be asked on Unix & Linux, those about Windows on Super User, those about Apple products on Ask Different and generic programming questions on Stack Overflow." – karel, N0rbert, Eric Carvalho, George Udosen

If this question can be reworded to fit the rules in the help center, please edit the question.













  • Hi! This is a question for stack overflow. People here only really answer questions about Ubuntu directly.
    – Lewis Smith
    Nov 21 at 9:51

  • @LewisSmith not really, text processing is very much on topic here in practice. Just look at the numerous posts on AU on the topic.
    – Jacob Vlijm
    Nov 21 at 10:10

  • Are the lines/numbers sorted in the file?
    – Jacob Vlijm
    Nov 21 at 10:11

  • @JacobVlijm - My apologies. In that case ignore me.
    – Lewis Smith
    Nov 21 at 10:22

  • Agreed, @EliShain please clarify if you are looking for a strictly python based solution, or if another language would be acceptable to you.
    – Jacob Vlijm
    Nov 21 at 13:35















command-line text-processing






edited Nov 22 at 7:20 by muru
asked Nov 21 at 9:35 by Eli Shain




2 Answers
Using awk:



$ awk -F- '{n = substr($2, 2) + 0} (n - prev) != 1 {for (i = prev + 1; i < n; i++) print i} {prev = n}' input-file
2
4


  • -F- sets the field separator to - (so S1, E1, etc. become different fields).

  • Then we extract the episode number (n = substr($2, 2) + 0) by taking everything but the first character of the second field ($2). Adding 0 forces a number: substr returns a string, and without it awk would compare i < n as strings (so "9" < "10" would be false) and miss gaps that cross a digit boundary.

  • If the episode number is not the previous episode + 1 ((n - prev) != 1), we print all the numbers in between.

  • We save the current episode number in prev for the next iteration.


If the input isn't sorted, split the extraction from the gap check and insert a sort in between:



awk -F- '{print substr($2, 2)}' input-file | sort -n | awk '{n=$1} (n - prev) != 1 {for (i = prev + 1; i < n; i++) print i} {prev = n}'
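As a worked example of the sorted variant, here is a minimal sketch that wraps the pipeline in a shell function reading file names on standard input. The function name missing_episodes is made up for illustration, and the sketch assumes every name follows the S<season>-E<episode>-... pattern from the question.

```shell
# Sketch only: missing_episodes is a hypothetical name, not part of the answer.
# Reads file names on stdin and prints the episode numbers that are absent
# between episode 1 and the highest episode seen, ignoring the season.
missing_episodes() {
    # First awk: turn the second "-"-separated field ("E18") into a number (18).
    awk -F- '{ print substr($2, 2) + 0 }' |
    sort -n |
    # Second awk: print every number skipped between consecutive episodes.
    awk '{ n = $1 + 0 }
         n - prev > 1 { for (i = prev + 1; i < n; i++) print i }
         { prev = n }'
}

# Usage with the sample names from the question, deliberately out of order:
printf '%s\n' \
    'S10-E5-20100463-title-of-video-14467457.mp4' \
    'S1-E1-20100526-title-of-video-1400316375.mp4' \
    'S1-E3-20100547-title-of-video-15457547.mp4' |
    missing_episodes
```

For the three sample names this prints 2 and 4, the same result as the one-liner on sorted input. In a real directory, something like ls | missing_episodes should work as long as no file name contains a newline.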





edited Nov 22 at 7:48
answered Nov 22 at 7:32 by muru












  • Mind that the lines aren't sorted (yet)
    – Jacob Vlijm
    Nov 22 at 7:42

  • Added a fix, thanks!
    – muru
    Nov 22 at 7:48


















A somewhat more straightforward script.
The script assumes the last episode exists and extracts its episode number. It then iterates over [1..last] and checks that every episode in between exists. Note that this will not work for episodes numbered with leading zeroes.





#!/bin/bash

if [ -z "$1" ]; then
    echo "please specify season prefix" >&2
    exit 1
fi

# extract the highest episode number (numeric sort, so E10 sorts after E9)
last=$(ls -1 -- "$1"-E*-*.* | sed -n 's/^[^-]*-E\([0-9][0-9]*\)-.*/\1/p' | sort -n | tail -n 1)

for ((i = 1; i <= last; i++)); do
    # compgen -G succeeds if the glob matches at least one file
    if ! compgen -G "$1-E$i-*.*" > /dev/null; then
        echo "missing episode $i"
    fi
done


The script takes the season prefix as its first argument, e.g. S1.
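Since the question asks to ignore the season and check only the episode number, a directory-based check across all seasons might look like the sketch below. It is not the script above: the function name missing_in_dir is hypothetical, it assumes all files sit in one directory and follow the S<season>-E<episode>-... pattern, and it starts counting at episode 1.

```shell
# Sketch only: missing_in_dir is a hypothetical helper, not the answer's script.
# Lists episode numbers with no matching file in a directory, ignoring the
# season prefix. Assumes names look like S<season>-E<episode>-<rest>.
missing_in_dir() {
    dir=${1:-.}
    for f in "$dir"/S*-E*-*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        name=${f##*/}             # strip the directory part
        ep=${name#*-E}            # drop everything up to and including "-E"
        ep=${ep%%-*}              # keep only the text before the next "-"
        printf '%s\n' "$ep"
    done |
    sort -n |
    awk '{ n = $1 + 0 }
         n - prev > 1 { for (i = prev + 1; i < n; i++) print "missing episode " i }
         { prev = n }'
}

# Usage (the path is a placeholder):
#   missing_in_dir /path/to/videos
```

Because the episode text is converted with $1 + 0, this variant should also tolerate leading zeroes such as E007, at the cost of reading every file name once instead of testing one glob per episode.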







edited Nov 22 at 8:35 by pa4080
answered Nov 22 at 8:09 by Eli














