Extract Text from PDF for Metadata/Private Data Use

Posted: Tue Jan 05, 2021 10:24 pm
by GSBSwitch1
Is there a way to search for a string of text in a PDF file, extract it, and use it within Switch as either metadata or private data (or really any other way) so that I can rename the PDF file? I want the incoming PDF to be named after the string of text within the PDF.

I can search for the text segment in PitStop, but I couldn't figure out a way to log the variable text selection. It seems simple enough, but I am stuck on finding a solution.

--Evan

Re: Extract Text from PDF for Metadata/Private Data Use

Posted: Wed Jan 06, 2021 10:50 am
by abailescollins
I think we can do this by getting the text you need reported in the XML report so you can access it, and then use it as a variable to rename the file.
Maybe drop me a mail with some examples of what you want to do, and we can experiment.

I seem to remember doing this before with a customer.

Re: Extract Text from PDF for Metadata/Private Data Use

Posted: Wed Jan 06, 2021 4:08 pm
by GSBSwitch1
Andrew -- I emailed you directly with an example file to test.

Re: Extract Text from PDF for Metadata/Private Data Use

Posted: Fri Jan 08, 2021 3:52 pm
by GSBSwitch1
Curious if there are any other ideas on how to accomplish this. I am starting with a CSV file that triggers a SmartStream Designer VDP config, which creates a multi-page, multi-record PDF. I then split that PDF into individual PDFs per record (every 8 pages of the original PDF). I then need to name each personalized PDF using the string "EMPLID_Lastname_FinAid2021.PDF". All of it is static except the last name, which comes from the original CSV.
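To make the naming rule concrete, here is a rough Python sketch of the mapping from CSV rows to page ranges and file names. It is not a Switch script. It assumes the records in the PDF appear in the same order as the CSV rows, and the column names "EMPLID" and "Lastname" are guesses (it also assumes EMPLID is read from the CSV rather than being a literal string).

```python
import csv
import io

PAGES_PER_RECORD = 8  # from the post: one record every 8 pages


def planned_names(csv_text, emplid_col="EMPLID", lastname_col="Lastname"):
    """Return (first_page, last_page, filename) for each CSV row, in row order.

    Column names are assumptions; adjust them to the real CSV header.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    plan = []
    for index, row in enumerate(rows):
        first = index * PAGES_PER_RECORD + 1   # 1-based first page of this record
        last = first + PAGES_PER_RECORD - 1    # last page of this record
        emplid = row[emplid_col].strip()
        lastname = row[lastname_col].strip()
        plan.append((first, last, f"{emplid}_{lastname}_FinAid2021.PDF"))
    return plan


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the result.
    sample = "EMPLID,Lastname\n1001,Smith\n1002,Jones\n"
    for first, last, name in planned_names(sample):
        print(f"pages {first}-{last} -> {name}")
```

If the split order ever differs from the CSV order, this position-based mapping breaks, which is why reading the name from the PDF itself would be more robust.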

The one idea I was trying is to write that string into the live area of the InDesign SmartStream Designer template so that I could hopefully search for it and then use it within Switch to rename the PDF.

Appreciate any ideas.

--Evan

Re: Extract Text from PDF for Metadata/Private Data Use

Posted: Tue Aug 22, 2023 11:00 am
by mpeapell
Hey Evan,

Did you get a resolution on this? I am trying to do something similar, where I need to use PitStop to read some text in the PDF then get Switch to push it in a specific direction.

Martin

Re: Extract Text from PDF for Metadata/Private Data Use

Posted: Tue Aug 22, 2023 1:34 pm
by magnussandstrom
There is an option to 'Select text containing' in PitStop. Then you can log the selection as a warning.

For routing the file you can use the PitStop XML metadata or the orange warning outgoing connection.

You can use a Switch variable if your specific phrase is variable.
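As a rough illustration of the XML-metadata route, the Python sketch below pulls logged warning messages out of a report and makes a routing decision. The XML layout and element names here are made up for the example. A real PitStop XML report has its own schema, so the path would need to be adapted, and inside Switch you would normally do this with a metadata variable rather than external code.

```python
import xml.etree.ElementTree as ET

# Made-up report fragment; the real PitStop XML report schema is different.
SAMPLE_REPORT = """
<Report>
  <Warnings>
    <Warning><Message>EMPLID 1001 Smith</Message></Warning>
  </Warnings>
</Report>
"""


def logged_messages(report_xml, path=".//Warning/Message"):
    """Return the text of every element matching the (hypothetical) path."""
    root = ET.fromstring(report_xml)
    return [el.text.strip() for el in root.findall(path) if el.text]


def route(report_xml, phrase):
    """Return 'match' if any logged message contains the phrase, else 'no_match'."""
    found = any(phrase in message for message in logged_messages(report_xml))
    return "match" if found else "no_match"
```

The same extracted message could also feed a rename step instead of a routing decision.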

[Attachment: phrase.png]

Re: Extract Text from PDF for Metadata/Private Data Use

Posted: Wed Aug 23, 2023 10:53 am
by mpeapell
Thank you... this is perfect!