Assemble Job variable number of files

jan_suhr · Post by **jan_suhr** » Sun Mar 06, 2016 5:49 pm

I'm working on a flow where a number of files in a job need to be assembled for a PDF merge. The number of files varies since they are created by importing XML into InDesign. Each job has a different number of files.

How can I get the number of files?

I was thinking of some kind of counter that records the number of files that pass a point in the flow where the job.PrivateData is set from the filename of the XML that enters the flow. There is one XML entering the flow, but individual PDF files are output from InDesign. Several XML jobs can run at the same time or one after another.

I suppose this has to be done with scripting?

Thanks

Jan

gabrielp · Post by **gabrielp** » Mon Mar 07, 2016 3:16 am

I'm a little confused about the "there is one XML entering the flow" vs the several XML jobs running at the same time bit.

But... you can assemble on a key after an arbitrary time. So if you are dealing with several files coming in and you can say "they will all probably come in within 15 minutes of each other", you can assemble by the job number (let's say, in private data) after 20 minutes and be pretty confident they are all in there. The downside, of course, is that it adds a delay to your flow.
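To make the trade-off concrete, here is a minimal Python sketch of the idea. It is not Switch code and not how Assemble job is implemented; the function name, the tuple layout and the sample values are all made up for illustration. Files are grouped by key, and anything arriving after the window has passed misses the assembly.

```python
def assemble_by_key(arrivals, window_seconds):
    """Group (arrival_time, key, filename) tuples by key.

    Hypothetical model: a group closes window_seconds after its first
    file arrived; files arriving later are recorded as "late".
    """
    groups = {}
    for arrived_at, key, name in sorted(arrivals):
        group = groups.setdefault(
            key, {"first_seen": arrived_at, "files": [], "late": []}
        )
        if arrived_at - group["first_seen"] <= window_seconds:
            group["files"].append(name)
        else:
            group["late"].append(name)
    return groups


# Illustrative data: the third file arrives 25 minutes after the first.
arrivals = [
    (0, "job1", "a.pdf"),
    (300, "job1", "b.pdf"),
    (1500, "job1", "c.pdf"),
]
groups = assemble_by_key(arrivals, window_seconds=20 * 60)
print(groups["job1"]["files"], groups["job1"]["late"])
```

With a 20-minute window the slow third file ends up in `late`, which is exactly the risk of a purely time-based assembly.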

Hope this helps...

loicaigon · Post by **loicaigon** » Mon Mar 07, 2016 9:59 am

Hi,

If your XML import is part of the flow (meaning you use the InDesign configurator), I would suggest producing all the InDesign/PDF files in one command and outputting them to a folder. The script can then move that folder into another hot folder of your flow so it becomes a "folder" job, and Switch can even return the number of items in that job.
I find it easier to do it that way when I need to keep an overview of many things to be output from InDesign. To get InDesign to execute, you can have it open a dummy InDesign file, then close and remove it.

HTH

Loic
http://www.ozalto.com

jan_suhr · Post by **jan_suhr** » Tue Mar 08, 2016 3:34 pm

Thanks Loic, that sounds like a good idea.

Gabriel, the input to the flow is one Excel file with a number of records. Each record becomes a single XML that is then imported into InDesign to produce a single PDF. The number of single PDFs can differ for each Excel file, and they later have to be assembled into a single PDF with backsides added to each page.

So I need the number of files that the Assemble Job element has to pick up.

Thanks

Jan

gabrielp · Post by **gabrielp** » Tue Mar 08, 2016 3:52 pm

jan_suhr wrote:So I need the number of files that the Assemble Job element has to pick up.

OK. If you can't work that out, the method I described above lets you assemble these PDFs without knowing the total number of expected files, provided a delay is acceptable.

Arthur · Post by **Arthur** » Tue Oct 31, 2017 10:16 pm

Hello,
I have a very similar situation with a flow where I cannot determine the number of files to be assembled, as it is different each time.
All the more so because my merge point is fed by 2 flows: one generates individual PDFs from an XLSX table input, through XML generated from the Excel file; the second is just a flow for the initial set of files, any of which can drop out along the way because of the numerous checks and manipulations they can fail on, and thus go to PROBLEM JOBS.
So the initial number of files does not match the final number. As they all come down the pipe at different times, a delay scheme does not work.
One time a delay of 2 minutes would do for all of them; another time it may take 15-20 minutes if the files are big enough to need more processing time.
Therefore I am looking for a counter or any other solution that would hold the jobs, ideally based on the PrivateData / Original.Name, before releasing the files for assembling and subsequent merging. I imagine there must be a way of counting them as they go along.
I tried grouping them at various stages and using Job.FileCount to export the count to an XML, but that adds an additional 5-10 minutes for assembling files in each individual flow, so it unnecessarily adds to the overall processing time.
On the other hand, I still cannot get the data from those 2 XMLs successfully injected so that all the files coming down to the common folder would get that metadata (but this is because I do not quite understand how Inject works).

So if there is a way of getting this sorted minimising the time it takes ... I would appreciate any ideas

I do not have the scripting module.
Any solutions / ideas? Please... This is getting to be a nightmare.

loicaigon · Post by **loicaigon** » Wed Nov 01, 2017 7:23 pm

Something you could try is to embed the file count in the INDD XMP. It will then survive as XMP in the PDF and you can use it to assemble them. But it requires that you can somehow foresee the number of files at the InDesign stage.
If the data can be passed in the xml file, it becomes easy.

Here is a possible approach…

Arthur · Post by **Arthur** » Fri Nov 03, 2017 10:38 am

loicaigon wrote:Something you could try is to embed the file count in the INDD XMP. It will then survive as XMP in the PDF and you can use it to assemble them. But it requires that you can somehow foresee the number of files at the InDesign stage.
If the data can be passed in the xml file, it becomes easy.
Here is a possible approach…

The thing is, I am not talking about InDesign here.
The files are a mixture of MS Office files & PDFs + PDFs generated in Switch based on the XML file (which in turn is created with the humble help of the Excel to XML app).

So no InDesign at all, no XMP...
One flow generates individual PDFs from 1 PDF which is split on a per-page basis, and then the XML comes into play; the other flow converts MS Office documents.
Both process them accordingly using PitStop actions / preflights etc., and portals bring the files together in one folder, after which they are supposed to be merged in alphabetical / file name order.
As the operation may take a while and is split between 2 flows where, as I said, the number of files may change along the way, Hold Job or an assembler based purely on a time delay does not fit the purpose here.
All files are marked with the Original Job Name as Private Data, but this is insufficient, as the assembler and Hold Job are still time-delay driven.
So it happens that only part of the files are merged: not all of them made it down to the common folder in time, yet the delay window passed and so they were released for merging, regardless of the identifier held in PrivateData.

Switch is not so intelligent as to know whether all the files from initial job (which in my case is not 1 file = 1 job, but 1 job = tens or hundreds of files) have been processed and can be merged. This is why a counter would be of help

loicaigon · Post by **loicaigon** » Fri Nov 03, 2017 11:18 am

Sure. To some degree, Switch needs you to give it this info. Would that info be accessible in a database?
Apart from that…

Padawan · Post by **Padawan** » Fri Nov 03, 2017 5:43 pm

Some more ideas on how you can try to get the number of files picked up:
1. XPath expression
The XML file contains all the information about all the jobs you will pick up. I assume you use this XML file as an XML dataset?
If so, then you should be able to build a Switch variable that uses an XPath expression to count the files. You can then store this in private data at this point in the flow for later use.

To build the Switch variable you will need a metadata variable and use "XPath Expression" instead of "XML Location Path". Now you can use XPath functions. The function you'll need is the count() function.
https://msdn.microsoft.com/en-us/librar ... .110).aspx

If you can share the syntax of the XML file, then I can help with the XPath
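Since the real XML structure has not been posted, here is only a rough Python illustration of what counting with count() amounts to. The `<records>`/`<record>` element names and the sample data are invented; with this layout the XPath expression would be something like `count(/records/record)`. Python's built-in ElementTree does not support the count() function itself, so the sketch counts the matched elements instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: one <record> per file that will be produced.
SAMPLE_XML = """<records>
  <record><name>card_001</name></record>
  <record><name>card_002</name></record>
  <record><name>card_003</name></record>
</records>"""


def count_records(xml_text, path="./record"):
    """Return how many elements match `path` below the root element."""
    root = ET.fromstring(xml_text)
    return len(root.findall(path))


expected_files = count_records(SAMPLE_XML)
print(expected_files)
```

The resulting number is what you would store in private data for the assembler to use later.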

2. Number of files in a folder

Arthur wrote:The files are a mixture of MS Office & PDFs + PDFs generated in Switch based on the XML file

How does this work? If they output all the files in a job folder and you then use Ungroup job, the number of files in that job folder is stored in the Ungroup.NumFiles private data field.

(Btw, Ungroup Job adds more information which you can re-use/abuse. It is all described here:
http://www.enfocus.com/manuals/UserGuid ... p_job.html )

Padawan · Post by **Padawan** » Fri Nov 03, 2017 5:45 pm

That might help, but I assume that the bigger problem is that the number of files at the start of the flow changes through the flow. But I'm not really sure how a counter would help. When would the counter know that it has to start counting and that its information can be given to Assemble job?

When the first part of the job-to-be-assembled arrives at the assemble job, then another part might still need to be split or merged somewhere earlier in the flow.

Or am I misunderstanding your use case?

Arthur · Post by **Arthur** » Tue Nov 14, 2017 10:09 am

I think I sorted it out

It seems easier than I thought. I mean - I complicated it way too much at the beginning, yet the solution was very simple

First step was to make sure that no file drops out due to some silly reasons / preflight checks etc.

Second step: at the input point I simply added a path to a separate flow (call it counter) and a hold element to stop the files in the main flow from being processed for, say, 30 sec.
In the counter flow the job folder is ungrouped and, as per the Switch Reference Guide, this element stores the number of files in the injected job in the Private Data:
<key>.NumFiles [The total number of files injected in the flow for this parent job].

Third step: I filtered out the file which, as I said before, is to be split on a per-page basis and counted its pages. This equals the number of files it would produce in the main flow (based on the XML file produced from the XLSX injected with it).

All of this data is then picked up by the Take Notes application and exported to an XML file.
The whole operation takes ca. 5-10 seconds tops.

So the XML can be injected at the last stage before the assembler comes into play, and the number of files to assemble is a simple calculation based on the info stored in the XML and injected into the incoming job, as each file carries this metadata with it.

So whichever file comes down the pipe first will trigger the injection of the XML, and voilà.
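The logic of releasing on a count rather than on a timer can be sketched in a few lines of Python. This is only a model of the idea, not Switch code and not how Assemble job works internally; the class, the method names and the sample key and file names are all hypothetical, and the expected total is simply passed in rather than calculated.

```python
class CountingAssembler:
    """Hold files per key until the expected number has arrived."""

    def __init__(self):
        self.expected = {}  # key -> expected number of files
        self.held = {}      # key -> file names received so far

    def set_expected(self, key, count):
        """Record the expected total, e.g. read from the injected XML."""
        self.expected[key] = count

    def add(self, key, filename):
        """Register one file; return the sorted set once complete, else None."""
        self.held.setdefault(key, []).append(filename)
        expected = self.expected.get(key)
        if expected is not None and len(self.held[key]) >= expected:
            return sorted(self.held.pop(key))
        return None


assembler = CountingAssembler()
assembler.set_expected("order_123", 3)
first = assembler.add("order_123", "page_002.pdf")
second = assembler.add("order_123", "cover.pdf")
released = assembler.add("order_123", "page_001.pdf")
print(first, second, released)
```

Nothing is released until the third file arrives, however long that takes, and the released list is in file name order, ready for merging.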

This works in theory; however, now comes the struggle with Inject Job Lite, as I do not understand at all how it works. I mean, it literally works differently from what I would expect or can get my head round. But this is another story.
Counter sorted.

Thank you everyone for your help, especially Padawan, as without your reference to the Ungroup element documentation I would not have got it sorted.

Padawan · Post by **Padawan** » Tue Nov 14, 2017 1:38 pm

Glad I could help!
