Determine files missing from a group
Posted: Thu Jul 20, 2023 10:55 pm
I have a flow that uses Split PDF to create single-page PDFs. These go through a third-party app to be processed (I use the "Generic Application" element for this). Once processed, the single-page PDFs are regrouped and then I merge them into a single PDF again.
There are occasions where one (or more) of the single-page PDFs fails to process in the third-party app.
For example, a 20-page PDF is submitted to the flow. I end up with 20 single-page PDFs. Page 18 fails to process. The other 19 wait for regrouping. The orphan timeout passes and I end up with a 19-page PDF when all is done.
So I am trying to think of a quick and easy way to know which page failed. My initial idea is extremely cumbersome. I will detail it below and hopefully someone will suggest something more efficient.
1. The multi-page PDF starts in the flow. I capture the unique ID and save it as a private data tag. I also save the page count as a private data tag.
2. After Split PDF, my single pages continue to the third-party app. But they also continue into a separate repository, just a temporary holding location. Their names include the unique ID that I captured earlier as a private data tag.
3. After the single pages are processed by the third-party app, they continue on to be regrouped. But they also continue on a different path, where each one injects its matching PDF that was placed in the repository earlier. So if a processed page injects its repository PDF, all is well, since that page was successfully processed. After injection there is no reason to keep these injected PDFs, so they can be recycled.
4. Meanwhile, the processed single-page PDFs have been regrouped, then merged back into a multi-page PDF. If its page count matches the page count I originally captured as private data, all is well: it means I have all my processed pages in the resulting multi-page PDF.
5. But if the page count does not match, the fun begins. I need my multi-page PDF to inject any pages that failed to process. These would still be in the repository, since no processed single-page PDF ever came along to inject them. I could use Inject Wildcard to inject my failures, in case more than one page failed to process.
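To make the bookkeeping in the list above concrete, here is a minimal Python sketch of the same logic outside of Switch: every split page leaves a copy in the holding repository, every processed page removes its own copy, and whatever is left over for that job is a failure. The naming pattern "<uniqueID>_<page>.pdf" and all function names are assumptions made up for illustration; none of this is Switch scripting API.

```python
import re

# Assumed naming convention for the single pages: "<uniqueID>_<page>.pdf".
PAGE_NAME = re.compile(r"^(?P<job>.+)_(?P<page>\d+)\.pdf$", re.IGNORECASE)


def hold_pages(job_id, page_count):
    """Names of the copies parked in the holding repository after the split."""
    return {f"{job_id}_{page:03d}.pdf" for page in range(1, page_count + 1)}


def release_processed(held, processed_names):
    """Each processed page pulls its own copy out of the repository."""
    return held - set(processed_names)


def failed_pages(still_held, job_id):
    """Pages of this job that are still held, so they never came back."""
    pages = []
    for name in still_held:
        match = PAGE_NAME.match(name)
        if match and match.group("job") == job_id:
            pages.append(int(match.group("page")))
    return sorted(pages)


# The 20-page example, with page 18 failing in the third-party app.
held = hold_pages("A1B2", 20)
processed = [f"A1B2_{page:03d}.pdf" for page in range(1, 21) if page != 18]
print(failed_pages(release_processed(held, processed), "A1B2"))  # prints [18]
```

Filtering on the job ID in failed_pages matters if several multi-page PDFs share the same holding repository at once.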
Anyway, if you follow, this would work: I would have my incomplete multi-page PDF and also my single-page failures. But maybe there is a quicker way to achieve this end result?
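As a rough idea of what a quicker check might look like: assuming the page number of each single page is still available at regrouping time (for example in its name or in private data), the failures are simply the expected page numbers that never showed up. A small Python sketch of that comparison follows; the helper name is hypothetical and this is plain logic, not anything built into Switch.

```python
def missing_pages(expected_count, regrouped_pages):
    """Return the page numbers from 1..expected_count that were not regrouped."""
    expected = set(range(1, expected_count + 1))
    return sorted(expected - set(regrouped_pages))


# The 20-page example: every page except 18 arrives for regrouping.
regrouped = [page for page in range(1, 21) if page != 18]
print(missing_pages(20, regrouped))  # prints [18]
```

This would only need the original page count and the list of pages that arrived, so no second copy of every page would have to be kept.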