Notepad++: How to group correctly so replacement works

I was given the replacement "\n\1=\2". My job is to create a search pattern so that the replacement is successful.

My data goes like this:

There are Spanish translations for some names like Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).

My goal is to make it look like this:

There are Spanish translations for some names like 

Mary = Maria

John = Juan

Michael = Miguel

Joseph = Jose.

The search pattern I have so far is "are.|,.|and. + (^[a-z]\S)+(.*\S)". Even though the first part of the search works, it does not group correctly for the replacement. I don't know what to change in the pattern to make it work.

edited Jan 23 at 0:17

Scott

asked Jan 23 at 0:02

Angel

windows notepad++

2 Answers

Ctrl+H

Find what: (\w+)\h+\((\w+)\)\h*(?:,|and)?\h*(\.)?

Replace with: \n$1 = $2$3

check Wrap around

check Regular expression

Replace all

Explanation:

(\w+)       # group 1, 1 or more word characters, the English name

\h+         # 1 or more horizontal spaces

\(          # opening parenthesis

(\w+)       # group 2, 1 or more word characters, the Spanish name

\)          # closing parenthesis

\h*         # 0 or more horizontal spaces

(?:         # non-capturing group

    ,       # a comma

  |         # OR

    and     # literally "and"

)?          # end group, optional

\h*         # 0 or more horizontal spaces

(\.)?       # group 3, a dot, optional

Replacement:

\n          # linefeed; you can use \r\n for a Windows line break

$1          # content of group 1

 =          # space, equal sign, space

$2          # content of group 2

$3          # content of group 3

Result for given example:

There are Spanish translations for some names like 

Mary = Maria

John = Juan

Michael = Miguel

Joseph = Jose.
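
If you want to check the pattern outside Notepad++, here is a minimal Python sketch of the same idea. It is only an approximation, not the Notepad++ engine: Python's re module has no \h, so [ \t] is used as an assumed stand-in for "horizontal space", and group references are written \g<1> instead of $1. The variable names are just for illustration.

```python
import re

text = ("There are Spanish translations for some names like "
        "Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).")

# Same structure as the Notepad++ pattern; [ \t] replaces \h (assumption,
# because Python's re module does not support \h).
pattern = re.compile(r"(\w+)[ \t]+\((\w+)\)[ \t]*(?:,|and)?[ \t]*(\.)?")

# Group 3 (the final dot) is optional; when it did not take part in a match,
# \g<3> expands to an empty string (Python 3.5 and later).
result = pattern.sub(r"\n\g<1> = \g<2>\g<3>", text)

print(result)
```

Each "Name (Nombre)" pair, together with the separator that follows it, is consumed by one match, so every pair ends up on its own line.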

answered Jan 23 at 9:20

Toto

I’m not specifically familiar
with the search-and-replace capability of Notepad++,
but Unix’s sed is pretty similar.

I believe that there’s no way you can get the exact result that you want
with the exact replacement string you showed,
because you want spaces before the Spanish names ( Maria, Juan, etc.),
but there are no such spaces present in the input text.
You’ll need to add at least one space to the replacement string.

The -r option tells sed to use “extended regular expressions”.
We don’t absolutely need that option, but

sed (by default) uses ( and ) to match literal parentheses,
and \(…\) to capture a group, while

sed -r uses \( and \) to match literal parentheses,
and (…) to capture a group.

You seem to be expecting to be able to use (…) to capture a group,
so I’ll do this with -r.

You can do this in a single substitution in sed with

sed -r 's/\s([A-Za-z]*)\s\(([A-Za-z]*)\)(,| and)*/\n\1 = \2/g'

That can be broken down into

sed -r 's/   \s   ([A-Za-z]*)   \s   \(   ([A-Za-z]*)   \)   (,| and)?   /   \n\1 = \2   /g'

s/ - begin a substitute command.

\s - a space.
In sed, you can use actual space characters;
I suspect that that’s true for Notepad++ as well.
In sed you can also use [[:space:]].
Of course a space matches just a space,
but \s and [[:space:]] match any whitespace character, including tab.

([A-Za-z]*) - a capture group of any number of letters
(upper or lower case), to match the English version of the name.
In sed you can also use [[:alpha:]]
(or [[:upper:]] or [[:lower:]], as desired).

\s - another space.

\( - a literal left parenthesis
(the one before the Spanish version of the name).

([A-Za-z]*) - same as above: a capture group of any number of letters
(upper or lower case), to match the Spanish version of the name.

\) - a literal right parenthesis
(the one after the Spanish version of the name).

(,| and)? - a group that matches "," or " and", zero or one time.
This matches the stuff that comes between
the right parenthesis after the Spanish version of the name,
and the next English-version name.
We need to be able to handle zero occurrences of this group
because we need to match Joseph (Jose),
even though there’s no comma or “and” after it.

Note that we could use "\sand" instead of " and";
I believe that " and" is much more readable.
Also note that we could use * (zero or more, with no limit)
instead of ?.

/ - end of search string; beginning of replacement string.

\n\1 = \2 - your replacement string
(newline, the first capture group, " = ", and the second capture group).
As mentioned earlier, I have added spaces before and after the =.

/g - end of command.
The g stands for “global” and specifies that the substitution
should be performed as many times as possible (the default is once per line).

So the Notepad++ command is probably very similar.
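
For anyone who wants to experiment without sed, the substitution above can be sketched in Python. This is an assumed equivalent, not a sed transcript: the helper name convert and its parameters are made up for illustration, re.sub with count=0 plays the role of the /g flag, and count=1 mimics sed's default of one substitution per line.

```python
import re

LINE = ("There are Spanish translations for some names like "
        "Mary (Maria), John (Juan), Michael (Miguel) and Joseph (Jose).")


def convert(line, quantifier="?", count=0):
    """Apply the substitution; count=0 replaces every match, like sed's /g."""
    # \( and \) match the literal parentheses; plain ( ) capture groups,
    # which is the same convention as sed -r.
    pattern = r"\s([A-Za-z]*)\s\(([A-Za-z]*)\)(,| and)" + quantifier
    return re.sub(pattern, r"\n\1 = \2", line, count=count)


print(convert(LINE))
# For this input, ? and * on the separator group give the same output.
print(convert(LINE, quantifier="*") == convert(LINE))
# Only the first pair is rewritten when the replacement is not global.
print(convert(LINE, count=1))
```

Note that the space before each English name is part of the match, so no trailing space is left after "like", and the final dot is untouched because nothing in the pattern consumes it.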

answered Jan 23 at 8:01

Scott
