Switch and a large number of datasets cause sluggish behavior
Posted: Mon Sep 19, 2022 12:03 pm
Hello,
I'm currently working on a large flow for a client and I've encountered some slow behavior with Switch, so I did some investigating.
Here is the background:
XML is received and grouped into a batch by similar product, then the flow ungroups the batch to process each file individually, then groups/merges/imposes.
For 2000 files and about 20 processing steps in the flow, I ended up with over 350,000 datasets in my Switch folder!
I investigated and here is what I found:
When unbundling jobs, each previously attached dataset is duplicated (normal behavior).
When duplicating jobs (two arrows in a folder), all previously attached datasets are duplicated (normal behavior).
PitStop Server ALWAYS creates XML reports that in most cases are neither wanted nor requested. Here is the list:
com.enfocus.PitStopServer.cli-config
com.enfocus.PitStopServer.cli-taskreport
com.enfocus.PitStopServer.cli-variableset (only if a variable set is used)
When regrouping the jobs after processing, the datasets are merged, but the old ones remain in the dataset folder!
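To see which of these datasets dominate, the files in the dataset folder can be tallied by name. The following is only a sketch: it assumes each dataset is stored as a file whose name contains the dataset name, which may not match how Switch actually lays out its dataset folder, and `tally_datasets` is a hypothetical helper name.

```python
import os
from collections import Counter

# Dataset names taken from the list in the post.
MARKERS = (
    "com.enfocus.PitStopServer.cli-config",
    "com.enfocus.PitStopServer.cli-taskreport",
    "com.enfocus.PitStopServer.cli-variableset",
)

def tally_datasets(root, markers=MARKERS):
    """Count files under root, grouped by the first marker found in the file name.

    ASSUMPTION: dataset files carry the dataset name in their file name.
    Files matching no marker are counted under "other".
    """
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            key = next((m for m in markers if m in name), "other")
            counts[key] += 1
    return counts
```

Calling `tally_datasets()` with the path of the dataset folder and printing `.most_common()` on the result shows which dataset type accounts for most of the files.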
I have created an application to delete the datasets when they are no longer used. Does anyone have a better idea?
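For comparison, a generic cleanup pass could look like the sketch below. It is not the application mentioned above. It assumes, purely for illustration, that a dataset file counts as unused once it has not been modified for a given number of hours, and that removing it directly from disk is safe. Neither assumption is confirmed for Switch, so deletion defaults to a dry run. `find_stale_files` and `remove_files` are hypothetical names.

```python
import os
import time

def find_stale_files(root, max_age_hours, now=None):
    """Return sorted paths under root last modified more than max_age_hours ago.

    ASSUMPTION: file age is only a proxy for "no longer used".
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    stale.append(path)
            except OSError:
                # File vanished between listing and stat; skip it.
                continue
    return sorted(stale)

def remove_files(paths, dry_run=True):
    """Delete the given files unless dry_run is set; return how many were handled."""
    handled = 0
    for path in paths:
        if dry_run:
            print("would delete:", path)
        else:
            os.remove(path)
        handled += 1
    return handled
```

Running `remove_files(find_stale_files(folder, 24))` first with the default `dry_run=True` lists what would be deleted without touching anything.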
Edit: Typos