Notepad++: How to group correctly so replacement works
I was given the replacement "\n\1=\2". My job is to create a search pattern so that the replacement is successful.
My data goes like this:
There are Spanish translations for some names like Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).
My goal is to make it look like this:
There are Spanish translations for some names like
Mary = Maria
John = Juan
Michael = Miguel
Joseph = Jose.
The search I had so far was "are.|,.|and. + (^[a-z]\S)+(.*\S)". The first part of the search matches, but the groups are not captured correctly when the replacement runs. I don't know what to change in the search, or how to change it, to make it work.
windows notepad++
asked Jan 23 at 0:02 by Angel, edited Jan 23 at 0:17 by Scott
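One likely reason the groups come back empty is that `|` has the lowest precedence in most regex flavors, so the pattern splits into three whole alternatives and both capture groups belong only to the last one. The sketch below illustrates this with Python's `re` module instead of Notepad++ (a swap made so the example is runnable), and it writes the pattern with `\S`, assuming that is what the question intended.

```python
import re

text = ("There are Spanish translations for some names like "
        "Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).")

# The question's pattern, assuming the bare S was meant to be \S.
# The top-level "|" splits it into three separate alternatives:
#   are.   |   ,.   |   and. + (^[a-z]\S)+(.*\S)
pattern = re.compile(r"are.|,.|and. + (^[a-z]\S)+(.*\S)")

first = pattern.search(text)
matched_text = first.group(0)   # text consumed by the first alternative
captured = first.groups()       # the groups sit in the third alternative only

# Collect every match to see which alternatives ever succeed.
all_matches = [m.group(0) for m in pattern.finditer(text)]

print(repr(matched_text), captured)
print(all_matches)
```

Because only the group-free alternatives match here, a replacement that refers to group 1 and group 2 has nothing to insert. Wrapping the separators in their own non-capturing group, such as `(?:,|and)`, keeps the alternation from swallowing the rest of the pattern.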
2 Answers
Ctrl+H
- Find what: (\w+)\h+\((\w+)\)\h*(?:,|and)?\h*(\.)?
- Replace with: \n$1 = $2$3
- check Wrap around
- check Regular expression
- Replace all
Explanation:
(\w+)    # group 1, 1 or more word characters, the English name
\h+      # 1 or more horizontal spaces
\(       # opening parenthesis
(\w+)    # group 2, 1 or more word characters, the Spanish name
\)       # closing parenthesis
\h*      # 0 or more horizontal spaces
(?:      # non-capturing group
  ,      # a comma
  |      # OR
  and    # literally "and"
)?       # end group, optional
\h*      # 0 or more horizontal spaces
(\.)?    # group 3, a dot, optional
Replacement:
\n       # linefeed; you can use \r\n for a Windows line break
$1       # content of group 1
 =       # space, equal sign, space
$2       # content of group 2
$3       # content of group 3
Result for given example:
There are Spanish translations for some names like
Mary = Maria
John = Juan
Michael = Miguel
Joseph = Jose.
answered Jan 23 at 9:20 by Toto
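The find/replace pair from the Notepad++ answer can also be checked outside the editor. The following is a minimal sketch in Python, not Notepad++ itself: Python's `re` has no `\h`, so `[ \t]` stands in for horizontal whitespace, and `$1` becomes `\g<1>`. Both substitutions are assumptions of this sketch rather than part of the answer.

```python
import re

text = ("There are Spanish translations for some names like "
        "Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).")

# Python's re has no \h, so [ \t] stands in for "horizontal whitespace".
find_what = re.compile(r"(\w+)[ \t]+\((\w+)\)[ \t]*(?:,|and)?[ \t]*(\.)?")

# \g<3> is the optional trailing dot; since Python 3.5 a group that did
# not take part in the match expands to an empty string.
replace_with = r"\n\g<1> = \g<2>\g<3>"

result = find_what.sub(replace_with, text)
print(result)
```

The space after "like" is not part of any match, so the first output line keeps a trailing space.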
I’m not specifically familiar with the search-and-replace capability of Notepad++, but Unix’s sed is pretty similar. I believe that there’s no way you can get the exact result that you want with the exact replacement string you showed, because you want spaces before the Spanish names (“ Maria”, “ Juan”, etc.), but there are no such spaces present in the input text. You’ll need to add at least one space to the replacement string.

The -r option tells sed to use “extended regular expressions”. We don’t absolutely need that option, but sed (by default) uses ( and ) to match literal parentheses, and \(…\) to capture a group, while sed -r uses \( and \) to match literal parentheses, and (…) to capture a group. You seem to be expecting to be able to use (…) to capture a group, so I’ll do this with -r.

You can do this in a single substitution in sed with
sed -r 's/\s([A-Za-z]*)\s\(([A-Za-z]*)\)(,| and)?/\n\1 = \2/g'
That can be broken down into
sed -r 's/ \s ([A-Za-z]*) \s \( ([A-Za-z]*) \) (,| and)? / \n\1 = \2 /g'

s/
  begin a substitute command.
\s
  a space. In sed, you can use actual space characters; I suspect that that’s true for Notepad++ as well. In sed you can also use [[:space:]]. Of course a space matches just a space, but \s and [[:space:]] match space or tab.
([A-Za-z]*)
  a capture group of any number of letters (upper or lower case), to match the English version of the name. In sed you can also use [[:alpha:]] (or [[:upper:]] or [[:lower:]], as desired).
\s
  another space.
\(
  a literal left parenthesis (the one before the Spanish version of the name).
([A-Za-z]*)
  same as above: a capture group of any number of letters (upper or lower case), to match the Spanish version of the name.
\)
  a literal right parenthesis (the one after the Spanish version of the name).
(,| and)?
  a group that matches “,” or “ and”, zero or one time. This matches the stuff that comes between the right parenthesis after the Spanish version of the name and the next English-version name. We need to be able to handle zero occurrences of this group because we need to match “Joseph (Jose)”, even though there’s no comma or “and” after it. Note that we could use “\sand” instead of “ and”; I believe that “ and” is much more readable. Also note that we could use * (zero or more, with no limit) instead of ?.
/
  end of search string; beginning of replacement string.
\n\1 = \2
  your replacement string (newline, the first capture group, “ = ”, and the second capture group). As mentioned earlier, I have added spaces before and after the =.
/g
  end of command. The g stands for “global” and specifies that the substitution should be performed as many times as possible (the default is once per line).

So the Notepad++ command is probably very similar.
answered Jan 23 at 8:01 by Scott
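To see the effect of the added spaces without running sed, the substitution can be approximated in Python. This is a sketch under the assumption that Python's `re` treats this particular pattern the same way `sed -r` does; it compares the replacement with spaces around `=` against the replacement exactly as given in the question.

```python
import re

text = ("There are Spanish translations for some names like "
        "Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).")

# Same shape as the sed -r search: whitespace, letters, whitespace,
# "(", letters, ")", then an optional "," or " and".
pattern = re.compile(r"\s([A-Za-z]*)\s\(([A-Za-z]*)\)(,| and)?")

# Replacement with spaces added around "=", as the answer suggests.
with_spaces = pattern.sub(r"\n\1 = \2", text)

# Replacement exactly as given in the question: no spaces can appear.
without_spaces = pattern.sub(r"\n\1=\2", text)

print(with_spaces)
print(without_spaces)
```

The final period is never matched, so it simply stays attached to the last line.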