Find Number Of Records Within PDF
Hello folks! Is there a way to query a variable PDF for the total number of records it contains? Each record within the PDF could contain a variable number of pages. Example: record 1 may contain 20 pages; record 2 may contain 24 pages; record 3 may contain 16 pages, etc. Page 1 of every record contains a common phrase found only on page 1, such as a company tagline, that could be searched for. Or maybe there is something in the PDF structure itself that can identify the start of each new record throughout the file and report how many there are. I've worked with other transactional data software for printing that did this using OCR, I believe. If not FastLane, maybe by another method? Thank you for your time!
Re: Find Number Of Records Within PDF
Hello,
You would certainly need to define what the record marker is. It could be a piece of text you search for, although that's risky if the text can appear more than once per record.
It could be anything that makes the start of a record identifiable. If there is a logo, you could use "Check if visual content exists" as well.
If by chance the file has bookmarks that point to every new record, FastLane could be used.
There are many ways to tackle this.
Loic
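To make the text-marker idea concrete, here is a minimal Python sketch. It is not a FastLane feature. It assumes the text of each page has already been extracted into a list of strings by whatever PDF text tool you have available, and the names `find_record_starts`, `record_page_counts`, and the sample tagline are made up for illustration. It counts the pages that contain the marker phrase and derives the page count of each record.

```python
from typing import List


def find_record_starts(page_texts: List[str], marker: str) -> List[int]:
    """Return the 1-based page numbers whose text contains the marker phrase."""
    needle = marker.casefold()
    return [
        page_no
        for page_no, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


def record_page_counts(starts: List[int], total_pages: int) -> List[int]:
    """Turn record start pages into the number of pages in each record."""
    # Each record runs from its start page up to the page before the next start;
    # the last record runs to the end of the file.
    boundaries = starts + [total_pages + 1]
    return [boundaries[i + 1] - boundaries[i] for i in range(len(starts))]


# Hypothetical sample data: one string per page, as a text extractor might return.
TAGLINE = "Quality you can trust"  # assumed marker phrase, only on page 1 of a record
pages = [
    "ACME Corp - Quality You Can Trust\nStatement for customer A",
    "details page",
    "details page",
    "ACME Corp - Quality You Can Trust\nStatement for customer B",
    "details page",
    "ACME Corp - Quality You Can Trust\nStatement for customer C",
    "details page",
]

starts = find_record_starts(pages, TAGLINE)
counts = record_page_counts(starts, len(pages))
print("records:", len(starts))
print("start pages:", starts)
print("pages per record:", counts)
```

As noted above, this only works if the phrase really appears once per record. If the tagline can show up on inner pages too, the count will be too high, so a stricter marker (or a position check) is needed.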
Re: Find Number Of Records Within PDF
savvykong wrote: ↑Fri Dec 06, 2024 7:44 pm Hello folks! Is there a way to query a variable PDF for the total number of records it contains? Each record within the PDF could contain a variable number of pages. Example: record 1 may contain 20 pages; record 2 may contain 24 pages; record 3 may contain 16 pages, etc. Page 1 of every record contains a common phrase found only on page 1, such as a company tagline, that could be searched for. Or maybe there is something in the PDF structure itself that can identify the start of each new record throughout the file and report how many there are. I've worked with other transactional data software for printing that did this using OCR, I believe. If not FastLane, maybe by another method? Thank you for your time!
We do something similar for mailing jobs that we break by mailing tray, pallet, or bundle. We wrote a script, executed via Run Command, that looks at an x/y location within the PDF for the data to change. It chunks the data looking for the change, and once it sees one it goes back to find where the change happened and records the page number/range. We pick up the StandardOutput, which is the range, and use that to split the PDF, then push the pieces to be imposed. Once each 'tray' is imposed, we reassemble the entire job so the trays stay split for finishing downstream.
We also do something similar for POs we receive, where we process the info in the PDF at the points where it changes to create orders.
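For anyone wanting to try the chunk-and-backtrack idea, here is a rough Python sketch of the logic only. It is not the actual script described above (that one is PowerShell and reads the PDF directly). The function `find_key_ranges` and the `key_at` callback are hypothetical: `key_at(i)` stands in for "read the text at the x/y location on page i", which is the expensive step, so the sketch jumps ahead in chunks and only searches backwards when the value has changed. It assumes the pages are grouped, so each value appears in one contiguous run.

```python
from typing import Callable, List, Tuple


def find_key_ranges(
    key_at: Callable[[int], str], page_count: int, chunk: int = 50
) -> List[Tuple[str, int, int]]:
    """Return (key, first_page, last_page) tuples with 1-based page numbers.

    key_at(i) must return the key value for the 0-based page index i.
    Assumes each key value occupies one contiguous run of pages.
    """
    ranges = []
    start = 0
    while start < page_count:
        current = key_at(start)
        # Jump forward a chunk at a time while the key is unchanged.
        probe = start
        while probe + chunk < page_count and key_at(probe + chunk) == current:
            probe += chunk
        lo, hi = probe, min(probe + chunk, page_count - 1)
        if key_at(hi) == current:
            # Only happens when hi is the last page: the run goes to the end.
            end = hi
        else:
            # Go back and find the exact change by binary search.
            # Invariant: key_at(lo) == current and key_at(hi) != current.
            while hi - lo > 1:
                mid = (lo + hi) // 2
                if key_at(mid) == current:
                    lo = mid
                else:
                    hi = mid
            end = lo
        ranges.append((current, start + 1, end + 1))
        start = end + 1
    return ranges


# Hypothetical sample: tray IDs as they might be read from each page.
tray_ids = ["T1"] * 7 + ["T2"] * 3 + ["T3"] * 12
for key, first, last in find_key_ranges(tray_ids.__getitem__, len(tray_ids), chunk=5):
    # One "first-last" range per line, similar to what a script could write to standard output.
    print(f"{key}: {first}-{last}")
```

The number of ranges returned is also the record count. If a group can be shorter than the chunk size and its value can come back later in the file, a change could be skipped, so in that case use a chunk of 1 (a plain page-by-page scan).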
Color Science & Workflow Automation
Re: Find Number Of Records Within PDF
@rhd_ole - what scripting tool are you using to look at the x/y location in your PDF via Run Command? Is it PowerShell, or some third-party command-line tool that gives you this ability? This is very interesting to me. Thank you!
Re: Find Number Of Records Within PDF
@loicaigon - I wasn't aware of the "Check if visual content exists". I'm definitely looking into this, thanks a lot for this info!
Re: Find Number Of Records Within PDF
Yes, we mainly use PowerShell for stuff like this. It's allowed us to do so much.