Find Number Of Records Within PDF

savvykong
Newbie
Posts: 8
Joined: Fri Apr 05, 2024 8:22 pm

Find Number Of Records Within PDF

Post by savvykong »

Hello folks! Is there a way to query a variable PDF for the total number of records contained therein? Each record within the PDF could contain a variable number of pages. Example: record 1 may contain 20 pages; record 2 may contain 24 pages; record 3 may contain 16 pages, etc. Page 1 of every record contains a common phrase found only on page 1, such as a company tagline, that could be searched on. Or maybe there is something in the PDF structure itself that can identify the start of each new record throughout the entire file and report how many there are. I've worked with other transactional data software for printing that did it using OCR, I believe. If not FastLane, maybe by another method? Thank you for your time!
loicaigon
Advanced member
Posts: 524
Joined: Wed Jul 10, 2013 10:22 am

Re: Find Number Of Records Within PDF

Post by loicaigon »

Hello,

You would certainly need to define what the record marker is. It could be a text string you search for, although that is risky if the same text can also appear elsewhere in the file.
It could be anything that makes the start of a record identifiable. If there is a logo, you could use "Check if visual content exists" as well.
If by chance the file has bookmarks that point to every new record, FastLane could be used.

There are many ways to tackle this.
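
For anyone landing here later, the text-marker idea above can be sketched in a few lines. This is only a rough sketch in Python (not FastLane, and not something confirmed in this thread), assuming the text of each page has already been extracted into a list of strings by a PDF library or an OCR step. The names `find_record_starts` and `record_lengths` and the sample tagline are made up for illustration.

```python
from typing import List


def find_record_starts(page_texts: List[str], marker: str) -> List[int]:
    """Return the 1-based page numbers whose text contains the marker phrase."""
    needle = marker.casefold()  # case-insensitive comparison
    return [
        page_no
        for page_no, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


def record_lengths(starts: List[int], total_pages: int) -> List[int]:
    """Page count of each record, given the start pages and the file length."""
    # Each record ends where the next one starts; the last ends at the file end.
    ends = starts[1:] + [total_pages + 1]
    return [end - start for start, end in zip(starts, ends)]


# Hypothetical sample data: three records of 20, 24, and 16 pages,
# with the tagline appearing only on the first page of each record.
tagline = "Acme: Built To Last"
pages: List[str] = []
for page_count in (20, 24, 16):
    pages.append(f"{tagline}\nStatement page 1")
    pages.extend(f"Statement page {k}" for k in range(2, page_count + 1))

starts = find_record_starts(pages, tagline)
lengths = record_lengths(starts, len(pages))
print(f"records: {len(starts)}")
print(f"start pages: {starts}")
print(f"pages per record: {lengths}")
```

The count is only as reliable as the marker: if the phrase can show up on any page other than page 1 of a record, the result will be too high.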

Loic
rhd_ole
Member
Posts: 148
Joined: Mon Jan 24, 2022 5:36 pm

Re: Find Number Of Records Within PDF

Post by rhd_ole »

savvykong wrote: Fri Dec 06, 2024 7:44 pm Hello folks! Is there a way to query a variable PDF for the total number of records contained therein? Each record within the PDF could contain a variable number of pages. Example: record 1 may contain 20 pages; record 2 may contain 24 pages; record 3 may contain 16 pages, etc. Page 1 of every record contains a common phrase found only on page 1, such as a company tagline, that could be searched on. Or maybe there is something in the PDF structure itself that can identify the start of each new record throughout the entire file and report how many there are. I've worked with other transactional data software for printing that did it using OCR, I believe. If not FastLane, maybe by another method? Thank you for your time!

We do something similar for mailing jobs that we break by mailing tray, pallet, or bundle. We wrote a script, executed via Run Command, that looks at an x/y location within the PDF for the data to change. It chunks the data looking for the change, and once it sees one, it goes back to find where the value changed and records the page number/range. We pick up the StandardOutput, which is the range, and use that to split the PDF, then push the pieces to be imposed. Once each 'tray' is imposed, we reassemble the entire job so the pieces are split for finishing downstream.

We also do something similar for POs we receive, where we process the info in the PDF at the points where it changes to create orders.
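
To illustrate the change-detection idea, here is a minimal sketch. It is written in Python rather than the PowerShell script described above, and it uses a plain page-by-page scan instead of the chunk-and-backtrack search, so treat it as a simplified stand-in and not the actual script. It assumes the value at the x/y location has already been read for every page into a list; `page_ranges_by_value`, `format_ranges`, and the tray IDs are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def page_ranges_by_value(values: List[str]) -> List[Tuple[str, int, int]]:
    """Group consecutive pages that share a value into (value, first, last)."""
    ranges = []
    page = 1  # 1-based page number of the first page in the current run
    for value, run in groupby(values):
        run_length = sum(1 for _ in run)
        ranges.append((value, page, page + run_length - 1))
        page += run_length
    return ranges


def format_ranges(ranges: List[Tuple[str, int, int]]) -> str:
    """One 'first-last' page range per line, ready to print to standard output."""
    return "\n".join(f"{first}-{last}" for _, first, last in ranges)


# Hypothetical per-page values read from one fixed location (e.g. a tray ID).
tray_ids = ["T1", "T1", "T1", "T2", "T2", "T3"]
tray_ranges = page_ranges_by_value(tray_ids)
print(format_ranges(tray_ranges))
```

The printed ranges are what a calling workflow could read from standard output and hand to a split step. The number of ranges is also the record count asked about in the original question.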
Color Science & Workflow Automation
savvykong
Newbie
Posts: 8
Joined: Fri Apr 05, 2024 8:22 pm

Re: Find Number Of Records Within PDF

Post by savvykong »

@rhd_ole - what scripting tool are you using that looks at the x/y location in your PDF via the Run Command? Is it PowerShell or some other third-party command-line tool that gives you this ability? This is very interesting to me. Thank you!
savvykong
Newbie
Posts: 8
Joined: Fri Apr 05, 2024 8:22 pm

Re: Find Number Of Records Within PDF

Post by savvykong »

@loicaigon - I wasn't aware of the "Check if visual content exists". I'm definitely looking into this, thanks a lot for this info!
rhd_ole
Member
Posts: 148
Joined: Mon Jan 24, 2022 5:36 pm

Re: Find Number Of Records Within PDF

Post by rhd_ole »

savvykong wrote: Mon Dec 09, 2024 5:27 pm @rhd_ole - what scripting tool are you using that looks at the x/y location in your PDF via the Run Command? Is it PowerShell or some other third-party command-line tool that gives you this ability? This is very interesting to me. Thank you!

We mainly use PowerShell for stuff like this, yes. It has allowed us to do so much.