Get the strings from file1 that have no match in file2
I have two files, file1 and file2.
file1 contains (only words):
ABC
YUI
GHJ
I8O
file2 contains many paragraphs:
dfghjo ABC kll
njjgg bla bla GHJ
njhjckhv chasjvackvh ..
ihbjhi hbhibb jh jbiibi
I am using the command below to get the lines of file2 that contain a word from file1:
grep -Ff file1 file2
(This prints the lines of file2 in which words from file1 are found.)
I also need the words from file1 that are not found in file2.
Can anyone help me get this output:
YUI
I8O
I am looking for a one-line command (using grep, awk, or sed), as I am running it through pssh and can't use while or for loops.
bash text-processing grep sed awk
asked Feb 19 at 9:20 by Sin15
edited Feb 19 at 9:23 by terdon♦
2 Answers
Here's one way in awk:
$ awk 'NR==FNR{a[$1]++; next}{for(i in a){if($0 ~ i){found[i]++}}}END{for(i in a){if(!found[i]){print i}}}' file1 file2
YUI
I8O
Or, a bit more legibly:
$ awk 'NR==FNR{
         a[$1]++;
         next
       }
       {
         for(i in a){
           if($0 ~ i){
             found[i]++
           }
         }
       }
       END{
         for(i in a){
           if(!found[i]){
             print i
           }
         }
       }' file1 file2
YUI
I8O
Explanation
NR==FNR : NR is the number of lines read so far across all input files and FNR is the line number within the current file. When processing multiple files, the two will be equal only while reading the first file. So this is an easy way of saying "do this for the 1st file only".
a[$1]++; next : while reading the first file, save each word (the first and only field) as a key in the array a and then skip to the next line. The next also ensures that the rest of the command is not run for the first file.
for(i in a){ if($0 ~ i){ found[i]++ } } : for each of the words found in the first file (the keys of array a), check if the current line matches that word. If it does, save the word in the found array. This is run for each line of the second input file.
END{ } : do this after you've processed all input files.
for(i in a){ if(!found[i]){ print i } } : for each of the words in a, if the word is not also in the found array, print that word.
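One thing to keep in mind is that $0 ~ i treats each word as a regular expression, so a word containing characters such as . or * could match more than its literal text. The following is a minimal sketch of a variation, not part of the original answer, that assumes literal substring matching is what is wanted and uses awk's index() for it. The scratch directory and the variable names tmp and missing are only there to make the example self-contained, and sort is added because for (w in a) visits the keys in no guaranteed order.

```shell
# Recreate the sample data from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' ABC YUI GHJ I8O > "$tmp/file1"
printf '%s\n' 'dfghjo ABC kll' 'njjgg bla bla GHJ' \
              'njhjckhv chasjvackvh ..' 'ihbjhi hbhibb jh jbiibi' > "$tmp/file2"

# Same two-pass logic as above, but index() does a literal substring search,
# so regex metacharacters in a word are not interpreted.
missing=$(awk 'NR==FNR { a[$1]; next }
               { for (w in a) if (index($0, w)) found[w] }
               END { for (w in a) if (!(w in found)) print w }' \
              "$tmp/file1" "$tmp/file2" | sort)

printf '%s\n' "$missing"
rm -rf "$tmp"
```

For the sample data this prints the same two words as the original command, only in sorted order.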
Alternatively, you can use some of the core Linux utilities:
$ grep -hoP '\w+' file1 file2 | sort | uniq -u | xargs -I{} grep {} file1
I8O
YUI
Explanation
$ grep -hoP '\w+' file1 file2
ABC
YUI
GHJ
I8O
dfghjo
ABC
kll
njjgg
bla
bla
GHJ
njhjckhv
chasjvackvh
ihbjhi
hbhibb
jh
jbiibi
This will print all the words found in each file. The -h flag suppresses the file name prefix, the -o flag means "only print the matching portion of the line", and -P enables Perl Compatible Regular Expressions (PCRE), which let us use \w to mean "any word character" (so letters, numbers, _).
$ grep -hoP '\w+' file1 file2 | sort | uniq -u
chasjvackvh
dfghjo
hbhibb
I8O
ihbjhi
jbiibi
jh
kll
njhjckhv
njjgg
YUI
Now we pass the output of the previous command through sort and uniq -u to keep only the lines that occur exactly once: these are the words that are only present in one of the two files.
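A caveat worth illustrating: uniq -u drops every line that occurs more than once, regardless of which file it came from. The small sketch below uses a made-up word list (not the question's data) in which YUI is listed twice in file1 and never appears in file2; the variable name only_once is just for illustration.

```shell
# Hypothetical combined list: YUI, YUI, I8O, ABC come from file1,
# the final ABC is the only word found in file2.
only_once=$(printf '%s\n' YUI YUI I8O ABC ABC | sort | uniq -u)
printf '%s\n' "$only_once"
```

Only I8O survives. YUI is absent from file2 but is dropped anyway because it is repeated, so this approach assumes each word is listed once in file1.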
$ grep -hoP '\w+' file1 file2 | sort | uniq -u | xargs -I{} grep {} file1
I8O
YUI
Finally, we feed this list of unique words to xargs and have it grep each of them in file1. Only those unique words that are present in file1 will be returned, and unique words present in file1 are therefore not present in file2.
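The same "words in file1 but not in file2" idea can also be expressed as a set difference with comm. This is a sketch of an alternative, not the answer's own method. It assumes whole-word matching, uses the POSIX class [[:alnum:]_] in place of \w so that -P is not needed, and forces LC_ALL=C so that sort and comm agree on ordering. The scratch directory and the names words1, words2 and missing are hypothetical.

```shell
# Recreate the sample data from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' ABC YUI GHJ I8O > "$tmp/file1"
printf '%s\n' 'dfghjo ABC kll' 'njjgg bla bla GHJ' \
              'njhjckhv chasjvackvh ..' 'ihbjhi hbhibb jh jbiibi' > "$tmp/file2"

# Sorted, de-duplicated word lists for each file.
LC_ALL=C sort -u "$tmp/file1" > "$tmp/words1"
grep -oE '[[:alnum:]_]+' "$tmp/file2" | LC_ALL=C sort -u > "$tmp/words2"

# comm -23 suppresses columns 2 and 3, leaving lines found only in words1.
missing=$(LC_ALL=C comm -23 "$tmp/words1" "$tmp/words2")

printf '%s\n' "$missing"
rm -rf "$tmp"
```

Because both lists go through sort -u, a word repeated in file1 is still reported. In bash, the two intermediate files could be replaced by process substitution to get back to a single line.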
Super, it's working. Thanks a lot, @terdon.
– Sin15, Feb 19 at 9:35
@Sin15 you're welcome. Have a look at the updated answer, I added a much shorter and simpler version as well. Also, if this answer solved your issue, please take a moment to accept it by clicking on the checkmark on the left. That will mark the question as answered and is the way that thanks are conveyed on the Stack Exchange sites. Feel free to wait for more answers and accept another one, just remember to eventually accept one.
– terdon♦, Feb 19 at 9:43
Try this command:
grep -oFf file1 file2 | grep -vFf - file1
The first grep uses file1 as the pattern list and prints only the parts of the lines in file2 that match a pattern. It gives you:
ABC
GHJ
The second grep then reads that output as its pattern list (-f -) and prints the lines of file1 that do not match any of those patterns, so you get:
YUI
I8O
Tested on Red Hat Enterprise Linux ES release 4 (Nahant Update 3).
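One edge case to be aware of: with -F alone, the second grep does substring matching, so a word in file1 that merely contains a found word is filtered out too. The sketch below uses made-up data (ABCD added to file1, a one-line file2) to show the difference, and assumes that adding -x (match whole lines only) to the second grep is acceptable. The names tmp, loose and strict are hypothetical.

```shell
# Hypothetical data: ABCD contains ABC as a substring but is not in file2.
tmp=$(mktemp -d)
printf '%s\n' ABC ABCD YUI > "$tmp/file1"
printf '%s\n' 'dfghjo ABC kll' > "$tmp/file2"

# Original form: ABCD is dropped because it contains the found word ABC.
loose=$(grep -oFf "$tmp/file1" "$tmp/file2" | grep -vFf - "$tmp/file1")

# With -x the found words must equal the whole line of file1 to exclude it.
strict=$(grep -oFf "$tmp/file1" "$tmp/file2" | grep -vxFf - "$tmp/file1")

printf 'without -x:\n%s\n' "$loose"
printf 'with -x:\n%s\n' "$strict"
rm -rf "$tmp"
```

For the question's sample data both forms give the same result, since none of the words is a substring of another.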
answered Feb 19 at 9:32 by terdon♦ (the awk and grep/sort/uniq answer), edited Feb 19 at 9:42
answered Feb 19 at 11:31 by Lety (the grep -oFf answer), edited Feb 19 at 12:24