Extract Text from PDF for Metadata/Private Data Use

GSBSwitch1
Member
Posts: 25
Joined: Thu Jan 07, 2016 8:42 pm

Extract Text from PDF for Metadata/Private Data Use

Post by GSBSwitch1 »

Is there a way to search for and extract a string of text from a PDF file so it can be used within Switch as metadata or private data (or by any other means) to rename the PDF? I want the incoming PDF to be named after the string of text inside it.

I can search for the text segment in PitStop, but I couldn't figure out a way to log the variable text selection. It seems simple enough, but I am stuck on finding a solution.

--Evan
abailescollins
Advanced member
Posts: 458
Joined: Wed Apr 22, 2015 4:28 pm

Re: Extract Text from PDF for Metadata/Private Data Use

Post by abailescollins »

I think we can do this by getting the text you need reported in the XML report so you can access it, and then use it as a variable to rename the file.
Maybe drop me a mail with some examples of what you want to do, and we can experiment.

I seem to remember doing this before with a customer.
Head of Product Management @ Ultimate
abc@imposition.com
GSBSwitch1
Member
Posts: 25
Joined: Thu Jan 07, 2016 8:42 pm

Re: Extract Text from PDF for Metadata/Private Data Use

Post by GSBSwitch1 »

Andrew -- I emailed you directly with an example file to test.
GSBSwitch1
Member
Posts: 25
Joined: Thu Jan 07, 2016 8:42 pm

Re: Extract Text from PDF for Metadata/Private Data Use

Post by GSBSwitch1 »

Curious whether there is any other way to accomplish this. I am starting with a CSV file that triggers a SmartStream Designer VDP config, which creates a multi-page, multi-record PDF. I then split that PDF into individual PDFs per record (every 8 pages of the original PDF). I then need to name each personalized PDF with the string "EMPLID_Lastname_FinAid2021.PDF". Everything in the name is static except the last name, which comes from the original CSV.
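For illustration, here is a minimal sketch of building those names straight from the CSV, outside of Switch. It assumes the CSV has header columns called `EMPLID` and `LastName` (hypothetical names, as is treating EMPLID as a per-record value) and that the split PDFs come out in the same order as the CSV rows.

```python
import csv
import io


def build_names(csv_text, suffix="FinAid2021"):
    """Return one target file name per CSV row, in row order.

    Assumption: the columns are named EMPLID and LastName (hypothetical).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    names = []
    for row in reader:
        emplid = row["EMPLID"].strip()
        # Remove spaces so the last name stays a single token in the file name.
        last_name = row["LastName"].strip().replace(" ", "")
        names.append(f"{emplid}_{last_name}_{suffix}.PDF")
    return names


# Made-up sample data: record N of the split output would get name N.
sample = "EMPLID,LastName\n1001,Smith\n1002,Jones\n"
for record_number, name in enumerate(build_names(sample), start=1):
    print(record_number, name)
```

This only works if the record order in the PDF is guaranteed to match the CSV, which is why I would still prefer to read the name from the PDF itself.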

The one idea I was trying is to write that string into the live area of the InDesign SmartStream Designer template, so that I could search for it and then use it within Switch to rename the PDF.
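To show what I mean by searching for the string, here is a rough sketch that assumes the page text has already been extracted by some other step (the extraction itself is not shown). The pattern is my own guess at the layout: digits for the EMPLID, then letters for the last name, then the static suffix.

```python
import re

# Assumed layout: <digits>_<letters>_FinAid2021 (hypothetical pattern).
NAME_PATTERN = re.compile(r"\b\d+_[A-Za-z'\-]+_FinAid2021\b")


def find_name(page_text):
    """Return the target file name found in the text, or None if absent."""
    match = NAME_PATTERN.search(page_text)
    if match is None:
        return None
    return match.group(0) + ".PDF"


# Made-up sample text standing in for one record's extracted page text.
print(find_name("Award letter 1001_Smith_FinAid2021 for the coming year"))
```

Whatever does the search inside the flow would need to hand back the matched text, not just a yes/no result, for this to be usable for renaming.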

Appreciate any ideas.

--Evan
mpeapell
Newbie
Posts: 3
Joined: Fri Aug 11, 2017 10:42 am
Location: Swindon, UK

Re: Extract Text from PDF for Metadata/Private Data Use

Post by mpeapell »

Hey Evan,

Did you get a resolution on this? I am trying to do something similar, where I need to use PitStop to read some text in the PDF then get Switch to push it in a specific direction.

Martin
Martin :D
magnussandstrom
Advanced member
Posts: 345
Joined: Thu Jul 30, 2020 6:34 pm
Location: Sweden

Re: Extract Text from PDF for Metadata/Private Data Use

Post by magnussandstrom »

There is an option to 'Select text containing' in PitStop. You can then log the selection as a warning.

To route the file, you can use the PitStop XML metadata or the orange (warning) outgoing connection.

If the phrase you are searching for varies, you can use a Switch variable for it.

[Attachment: phrase.png]
Last edited by magnussandstrom on Wed Aug 23, 2023 12:09 pm, edited 1 time in total.
mpeapell
Newbie
Posts: 3
Joined: Fri Aug 11, 2017 10:42 am
Location: Swindon, UK

Re: Extract Text from PDF for Metadata/Private Data Use

Post by mpeapell »

Thank you... this is perfect!
Martin :D