Regex PDF search?

I am an electronics engineer and I regularly view PDF schematics. I often want to search a schematic for a single component, e.g. "R1".
The problem is that a search for "R1" also matches every designator that merely starts with it, such as R10 to R19 and R100 to R199. So I would like to be able to use a regex in my search, or at least have tighter control over it (e.g. search for whole words only).
Has anyone here found a good PDF tool on Ubuntu which supports these features?
pdf regex
edited Jan 8 '15 at 10:54
αғsнιη
24.1k2295156
asked Jan 8 '15 at 10:12


Brian J Hoskins
7119
The question is not a duplicate because of the context of its use. My context requires a graphical PDF reader with regex search options so that I can quickly navigate to components of interest in a schematic. Cheers.
– Brian J Hoskins
Jan 8 '15 at 11:31
3 Answers
Install pdfgrep:
sudo apt-get install pdfgrep
Then use the -C option together with a word-boundary match:
pdfgrep -C 0 '\<WORD\>' file.pdf
Alternatively, use \b...\b instead of \<...\>.
From man pdfgrep:
-C, --context NUM
Print at most NUM characters of context around each match.
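To see why the word boundaries matter, here is a minimal sketch using Python's re module rather than pdfgrep itself, so the regex engine differs but the idea is the same. The designator list is made up for illustration and does not come from any real schematic.

```python
import re

# Hypothetical reference designators, as they might appear in a
# schematic's text layer (illustrative data only).
designators = ["R1", "R10", "R11", "R100", "R1A", "CR1"]

plain = re.compile(r"R1")      # plain substring search, like a basic Ctrl+F
whole = re.compile(r"\bR1\b")  # whole-word search using word boundaries

# Keep every designator in which the pattern is found anywhere.
plain_hits = [d for d in designators if plain.search(d)]
whole_hits = [d for d in designators if whole.search(d)]

print("substring :", plain_hits)
print("whole word:", whole_hits)
```

The substring pattern hits every entry in the list, while the bounded pattern only hits the exact designator R1.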
For a graphical option, I searched and found JPedal (30-day trial). Download it and launch it from the command line:
java -jar jpedal-trial.jar
Now press Ctrl+F, type the word you want to find, and tick "Find Whole Words Only" in the drop-down menu (the down-arrow icon) to match whole words only.
edited Jan 10 '15 at 10:35
answered Jan 8 '15 at 10:44
αғsнιη
24.1k2295156
Thanks. The problem with this solution is that an electronics schematic is mostly a graphical document with text identifiers. The purpose of my search is not to determine if the text exists in the document or not, but to take me quickly to the component of interest. So the search must be completed in a graphical environment.
– Brian J Hoskins
Jan 8 '15 at 11:28
@BrianJHoskins updated answer. check please.
– αғsнιη
Jan 10 '15 at 10:36
If you are fine with creating an index of your documents, you could use Recoll, which is a full desktop search engine. For screenshots and installation instructions, please take a look at this answer.
Recoll searches are constructed using a powerful query language that supports wildcards and modifiers (e.g. proximity and slack).
For instance, the query "R1"l would only yield whole-word results, because the l modifier turns off stemming. (In this specific example you wouldn't even need the modifier, because Recoll doesn't expand sequences of numbers by default.)
edited Apr 13 '17 at 12:23
Community♦
1
answered Jan 10 '15 at 14:25


Glutanimate
16.1k873131
If the problem is just to limit the search to whole words, that is easy enough. Just add spaces before and after your search string, like so: " R1 ". I use this trick in Evince all the time.
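One caveat with the space-padding trick can be shown with a small sketch. It assumes the viewer does a plain substring search over the extracted text, which may not be exactly how Evince or any other reader behaves. The sample lines and both helper functions are hypothetical.

```python
import re

# Hypothetical lines of text extracted from a schematic (illustrative only).
lines = ["R1 10k", "see R1, R10", "C3 R1 R2", "net R10 R100"]

def padded_hits(lines, name):
    """Substring search for the name with a space on each side."""
    needle = f" {name} "
    return [ln for ln in lines if needle in ln]

def boundary_hits(lines, name):
    """Regex search for the name as a whole word."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    return [ln for ln in lines if pattern.search(ln)]

print("space-padded :", padded_hits(lines, "R1"))
print("word boundary:", boundary_hits(lines, "R1"))
```

Under that assumption, the padded search finds R1 only where it has a space on both sides, so it can miss a designator at the start of a line or next to punctuation, whereas the word-boundary pattern still finds it.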
answered Mar 17 '15 at 9:31


Brian Z
561213