Identifying an invisible character in plain text file
I am working with a plain text file that has invisible characters that I do not recognise. How can I identify them?
In Atom, they show as blanks when I toggle to show invisible characters. They do not show as a common space (the one that Atom shows as a small centered dot).
In BBEdit, it shows as a centered dot that looks slightly thicker than the common space. Replacing non-ASCII characters (with 'zap gremlins') does not replace it.
I can copy the character into a regular expression, and the query will find the character. It is not recognised as a whitespace character by \s.
I will copy the character here (between the arrows), but I have no idea if it actually shows up! -> <-
(wow, pasting an unknown invisible character felt absurdly awkward...)
regex characters
edited Feb 12 at 14:55
Blackwood
asked Feb 12 at 11:40
Matthijs
1 Answer
Using a hex editor should reveal the hex codes, which you can then look up or search for.
If you want to stay in a (bash?) terminal, you could put the whole file through hexdump / hd, or grep an offending line and pipe it to hd so you're only looking at one line, similar to:
grep "unique line text" file | hd
Or get only the Nth line with sed (replace N with the line number):
sed 'Nq;d' file
There's also the regular expression character class for all printable characters:
‘[:print:]’
Printable characters: ‘[:alnum:]’, ‘[:punct:]’, and space.
Searching for the inverse of those might be useful. Note that grep -v "[[:print:]]" inverts the line selection, so it prints only lines that contain no printable characters at all. To find lines containing at least one non-printable character, negate the class instead (which characters count as printable depends on the current locale):
grep "[^[:print:]]" file
Or if you can copy it successfully, you could just paste it into a hex editor, or into an echo " " | hd command...
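As a rough sketch of the terminal approach above: the two helper functions below are hypothetical names made up for illustration, and od is swapped in for hd only because od is specified by POSIX and hd is not installed everywhere. The byte-range pattern assumes grep is run with LC_ALL=C so that it compares single bytes.

```shell
#!/bin/sh
# dump_line FILE N: print the bytes of line N of FILE as hex.
# (hypothetical helper; od -An -tx1 is used here instead of hd)
dump_line() { sed -n "${2}p" "$1" | od -An -tx1; }

# find_odd_lines FILE: print the numbers of lines containing any byte
# outside the printable ASCII range (space through tilde).
# Tabs and other control characters are flagged as well.
find_odd_lines() { LC_ALL=C grep -n '[^ -~]' "$1" | cut -d: -f1; }

# Demo on a throwaway file whose second line contains the bytes c2 a0.
sample=$(mktemp)
printf 'plain line\nodd\302\240line\n' > "$sample"
find_odd_lines "$sample"
dump_line "$sample" 2
rm -f "$sample"
```

Any byte pair in the dump that does not map to a visible character on the line (here c2 a0) is the one to look up.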
Thanks! Apparently it was a non-breaking space (c2 a0). In my regex, \u00A0 selects the character.
– Matthijs Feb 12 at 13:10
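The bytes c2 a0 are the UTF-8 encoding of U+00A0 (NO-BREAK SPACE), which is why the hex dump and the \u00A0 regex escape refer to the same character. Once the character is known, a minimal sketch for locating or replacing it from the shell could look like the following. The function names are hypothetical, and the file is assumed to be UTF-8 encoded.

```shell
#!/bin/sh
# The two bytes c2 a0 (octal 302 240) form a UTF-8 non-breaking space.
nbsp=$(printf '\302\240')

# count_nbsp_lines FILE: number of lines containing a non-breaking space.
count_nbsp_lines() { grep -c "$nbsp" "$1"; }

# replace_nbsp FILE: write FILE to stdout with every non-breaking space
# replaced by an ordinary space (the file itself is left untouched).
replace_nbsp() { sed "s/$nbsp/ /g" "$1"; }

# Demo on a throwaway file.
demo=$(mktemp)
printf 'a\302\240b\nplain\n' > "$demo"
count_nbsp_lines "$demo"
replace_nbsp "$demo"
rm -f "$demo"
```

Building the pattern with printf avoids relying on \x escapes in sed or grep, which not every implementation supports.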
edited Feb 12 at 11:59
answered Feb 12 at 11:53
Xen2050