How to remove OCR from a PDF?
I have been searching Google for some time but cannot find an answer to my question.
I have an unwanted OCR text layer in a document that I recently scanned with Adobe Acrobat. It was not OCRed properly, and I want to redact some information, but the OCR layer causes information I want to keep to be erased as well. I converted the files to TIFFs, but noticed a (very) significant quality loss. I have heard that printing to another PDF either keeps the text or reduces the image quality.
I appreciate any help in solving this issue ASAP.
Thank You.
pdf adobe-acrobat ocr tif
edited Oct 12 '14 at 15:00
Sanoo
asked Oct 11 '14 at 6:32
Sanoo
6 Answers
In Acrobat Pro DC, the appropriate command is "Remove Hidden Information," which is available through both the "Protect" and "Redact" tools.
Running the command only searches for hidden information; it does not change the document. You must then tell Acrobat which information to remove. In this case, select "Hidden Text" in the Results pane, then click the Remove button and save the changed document.
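For readers curious what "hidden text" usually means inside the file: an OCR layer in a scanned PDF is typically drawn with text rendering mode 3 (invisible text, set by the `3 Tr` operator) on top of the page image. The following is only a minimal sketch of that idea, not what Acrobat does internally. It assumes the page content stream has already been decompressed into bytes, that each hidden text object sets `3 Tr` inside its own `BT ... ET` block, and that the strings `BT` and `ET` do not occur inside text strings. The function name `strip_invisible_text` and the sample stream are hypothetical.

```python
import re

# One text object: everything from a BT operator to the next ET operator.
TEXT_OBJECT = re.compile(rb"\bBT\b.*?\bET\b", re.DOTALL)
# The operator that selects text rendering mode 3 (invisible text).
INVISIBLE_MODE = re.compile(rb"(?<![\w.])3\s+Tr\b")


def strip_invisible_text(content: bytes) -> bytes:
    """Drop BT...ET text objects that switch to rendering mode 3.

    Simplification: the rendering mode is only looked for inside each
    text object, although in a real PDF it can persist across objects.
    """
    def keep_or_drop(match):
        block = match.group(0)
        return b"" if INVISIBLE_MODE.search(block) else block

    return TEXT_OBJECT.sub(keep_or_drop, content)


# Hypothetical decoded content stream: a page image plus two text objects.
sample = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"BT 3 Tr /F1 10 Tf 72 700 Td (hidden ocr words) Tj ET\n"
    b"BT 0 Tr /F1 10 Tf 72 650 Td (visible caption) Tj ET\n"
)
cleaned = strip_invisible_text(sample)
print(cleaned.decode("ascii"))
```

The image-drawing line and the visible text object are left untouched; only the text object drawn in mode 3 is removed. Real files need a proper PDF library to decode and rewrite streams, so for actual documents the Acrobat command above remains the practical route.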
edited Sep 22 '17 at 1:06
Warren Young
answered Apr 11 '17 at 4:11
user1125483
I have used the "remove hidden information", but for me for some reason that just removes parts of the image on certain pages. Thanks for your reply however.
– Sanoo
Apr 11 '17 at 4:20
This is not universally true. Somehow (probably macOS PDFKit bugs) my ABBYY FineReader-OCRed text got corrupted, and checking "Hidden text" under Redact → Remove Hidden did remove the text without any issues; I was then able to successfully use Enhance Scans → Recognize Text to perform OCR within Acrobat itself.
– Nicholas Riley
Jan 21 '18 at 20:16
The problem for me is that after I remove the hidden text, I'm still not able to run an OCR with "ClearScan" (i.e. "Editable Text and Images"). It's strange because the text layer appears to be gone, yet running OCR produces the error "Acrobat could not perform recognition because: page contains renderable text."
– user1125483
Sep 18 '18 at 10:38
After a lot of experimenting, I found that printing to Adobe PDF from Adobe Acrobat produces a document without the OCR layer and without a visible loss of quality (some resolution is lost, but it is not noticeable at first glance).
However, many sites claim that this does not work. I also tried other PDF printers, such as Foxit Reader and OneNote, but they reduced the quality. Converting to JPEG gave the same result.
Please keep in mind that your mileage may vary.
Note: I am leaving this thread marked as unanswered in hope of finding a better answer than mine.
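If a print or export path re-rasterizes the scan, the output resolution puts an upper bound on how much detail can survive, which may explain why some printers look worse than others. As a rough sketch of the arithmetic only: PDF page sizes are measured in points (1/72 inch by default), so the pixel dimensions at a given DPI follow directly. The function name `raster_size`, the US Letter page size, and the DPI values are illustrative assumptions, not settings taken from any of the tools mentioned here.

```python
POINTS_PER_INCH = 72  # default PDF user space unit is 1/72 inch


def raster_size(width_pt, height_pt, dpi):
    """Return (width_px, height_px) for a page rendered at the given DPI."""
    return (
        round(width_pt / POINTS_PER_INCH * dpi),
        round(height_pt / POINTS_PER_INCH * dpi),
    )


# Assumed page size: US Letter, 612 x 792 points (8.5 x 11 inches).
LETTER = (612, 792)
for dpi in (150, 300, 600):
    print(dpi, raster_size(*LETTER, dpi))
```

A scan captured at 300 DPI and re-rendered at 150 DPI keeps only a quarter of its pixels, so it is worth checking the resolution setting of whichever printer or exporter is used.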
edited Oct 13 '14 at 7:53
answered Oct 13 '14 at 6:06
Sanoo
In Acrobat Pro, use "Remove Hidden Information" (under "Protection"). Select all of the results and run the removal; the OCR layer is then gone.
answered Oct 20 '16 at 15:55
jazzzz
In Acrobat X, under Protection, there is a Sanitize Document button that removes EVERYTHING except what can be seen (including the OCR'd text layer), converting the document to a flattened bitmap.
edited Jan 30 '18 at 16:51
darthbith
answered Dec 14 '17 at 8:49
Dave
(one year ago...)
If, as you say, the documents were scanned rather than printed to PDF from, for example, Word, you can easily remove the OCR layer in Acrobat:
Select Document, then Examine Document; from there you can remove the hidden text (OCR).
answered Dec 10 '15 at 10:50
Fran
Thanks for your reply. I'll test it out as soon as I can and let you know. Thanks for the answer!
– Sanoo
Feb 19 '16 at 14:31
I thought I already commented on this, but the problem is that I have Acrobat DC Pro, and those menus have been removed. Thanks for your answer anyway.
– Sanoo
Jul 17 '16 at 7:43
I built a free tool to do this, PDF Redactor. If you upload the file and just click Redact, it will flatten your PDF and remove the OCR layer. If you want, you can also draw redaction marks on the document.
edited Jan 31 at 8:19
answered Jan 31 at 7:31
levinology