Routing jobs based on lookup from data file

jstotz
Newbie
Posts: 10
Joined: Thu Jan 22, 2015 7:47 pm

Post by jstotz »

I currently have a flow that routes jobs (PDF files) based on part of the file name (job.name). They all start in one input folder and are routed to one of 9 different output folders. Each connector uses a "condition with variables defined" that contains multiple conditions OR'd together in the form of:
[job.name] contains STRING1
OR [job.name] contains STRING2
OR [job.name] contains STRING3
etc.

A more real-world example would be:
Output folders:
Mammal
Fish
Bird

Input files:
Dog.pdf
Salmon.pdf
Cat.pdf
Robin.pdf
Starling.pdf
Flounder.pdf
Cow.pdf
Bass.pdf
Horse.pdf


Since I have 9 output folders and about 100 strings distributed among them, it is getting hard to maintain. The list of strings keeps growing. Every week I add one or two new strings to look for.

I want to redo the flow in a way that would make it easier to maintain by having one central list of strings. I was thinking of making a text file that would be used to determine which output folder each file goes to. I figured that the file would be a list of key/value pairs (string, output folder) that could be read by a script and used to route the files.

The script might look something like this: (not real code)

Method A
read all the lines from the data file into an array
for each element in array
    if jobname.contains( element.key ) then
        output = element.value
        exit loop
    end if
end

sendto( output )
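
For illustration, Method A could be sketched in plain JavaScript roughly as follows. The `key=folder` line format, the function names `parseRoutingTable` and `routeJob`, and the sample data are all assumptions made for this sketch; none of it is Switch API, and the file content is passed in as a string so the lookup logic can be tried outside a flow.

```javascript
// Parse "key=folder" lines (an assumed file format) into an ordered list of pairs.
function parseRoutingTable(text) {
  var entries = [];
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i].trim();
    var sep = line.indexOf("=");
    if (line === "" || sep < 1) continue; // skip blank or malformed lines
    entries.push({
      key: line.slice(0, sep).trim(),
      folder: line.slice(sep + 1).trim()
    });
  }
  return entries;
}

// Return the folder of the first entry whose key occurs in the job name
// (case-sensitive), or the fallback when nothing matches.
function routeJob(jobName, entries, fallback) {
  for (var i = 0; i < entries.length; i++) {
    if (jobName.indexOf(entries[i].key) !== -1) return entries[i].folder;
  }
  return fallback;
}

// Hypothetical data file content and job name.
var table = parseRoutingTable("Dog=Mammal\nSalmon=Fish\nRobin=Bird\n");
var folder = routeJob("Dog.pdf", table, "Unknown"); // "Mammal"
```

Because the first matching entry wins, the order of lines in the data file matters when one key is a substring of another.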

Method B
Data file is grouped by output folder. Could be a separate data file for each folder. Would just have strings, no output values for each line.

read all the lines from the data file(s) but put each group into a separate array for each output folder
    array1
    array2
    array3

str = extract the part of the file name that will determine the output folder (for example 4 chars after the first underscore)
for each array
    if array.contains( str ) then
        output = array#
        exit loop
    end if
end

sendto( output )
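
A possible variation on Method B, sketched in plain JavaScript: instead of looping over one array per folder, the groups are flattened once into a single lookup object, so routing a job is one property access. The 4-character codes, the job names, and the function names (`buildIndex`, `extractCode`, `routeByCode`) are hypothetical and only serve the example.

```javascript
// Build a lookup from each string to its folder name.
function buildIndex(groups) {
  var index = {};
  for (var folder in groups) {
    if (!Object.prototype.hasOwnProperty.call(groups, folder)) continue;
    var strings = groups[folder];
    for (var i = 0; i < strings.length; i++) index[strings[i]] = folder;
  }
  return index;
}

// Extract the 4 characters after the first underscore, or "" if there is none.
function extractCode(jobName) {
  var pos = jobName.indexOf("_");
  return pos === -1 ? "" : jobName.slice(pos + 1, pos + 5);
}

// Look the extracted code up in the index; use the fallback when it is unknown.
function routeByCode(jobName, index, fallback) {
  var code = extractCode(jobName);
  return Object.prototype.hasOwnProperty.call(index, code) ? index[code] : fallback;
}

// Hypothetical grouped data, one list of codes per output folder.
var index = buildIndex({
  Mammal: ["DOG1", "CAT1"],
  Fish: ["SALM", "BASS"],
  Bird: ["ROBN"]
});
var target = routeByCode("job_SALM_001.pdf", index, "Unknown"); // "Fish"
```

This only works when the deciding part of the file name can be extracted exactly; a "contains" test against arbitrary substrings still needs a loop as in Method A.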


Method B is probably more efficient because it has fewer array elements to check. However, the problem with both methods is that they have to re-read the data file for each PDF that comes through the flow.

Is there a way to read the data file into the flow once when the flow is started and have that info in memory all the time?

Is there a better approach?

Thanks.
loicaigon
Advanced member
Posts: 372
Joined: Wed Jul 10, 2013 10:22 am

Re: Routing jobs based on lookup from data file

Post by loicaigon »

By using Regular Expressions, you could already narrow the scope and limit outgoing connections.

In your demo case, you could have only 3 or 4 outgoing connections (think of Mammals, Fishes, Birds and unknown). Then with regular expressions you can filter files based on their names.
For example, the connection leading to Mammals would be set like this:
(Cat|Dog|Horse|Cow)\.pdf
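
To see what that pattern selects, here is a small check in plain JavaScript, run against the sample file names from the first post. It only illustrates the regular expression itself; how Switch applies the pattern to a job name (whole name or partial match) is not covered here, and the anchored variant `wholeName` is an addition for comparison.

```javascript
// The pattern from the post, and a stricter variant anchored to the whole name.
var mammals = /(Cat|Dog|Horse|Cow)\.pdf/;
var wholeName = /^(Cat|Dog|Horse|Cow)\.pdf$/;

var files = ["Dog.pdf", "Salmon.pdf", "Cat.pdf", "Robin.pdf", "Starling.pdf",
             "Flounder.pdf", "Cow.pdf", "Bass.pdf", "Horse.pdf"];

// Keep only the names the unanchored pattern matches.
var matched = files.filter(function (name) { return mammals.test(name); });
// matched: Dog.pdf, Cat.pdf, Cow.pdf, Horse.pdf

// The unanchored pattern also accepts names that merely end with a listed word.
var loose = mammals.test("WildCat.pdf");    // true
var strict = wholeName.test("WildCat.pdf"); // false
```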

But if you prefer to rely on an external file, you will have to use a script expression (which requires the scripting module):

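var path = "/Users/ozalto/Desktop/mammals.txt";
// Read the whole list; the file holds one name per line.
var c = File.read(path, "UTF-8");
var names = c.split("\n");
// Build an alternation such as (Cat|Dog|Horse|Cow).
// Note: a blank line in the file adds an empty alternative that matches every job name.
var reg = new RegExp("(" + names.join("|") + ")");
var jobName = job.getNameProper();
jobName.match(reg);
// The script expression evaluates to this last line: true when a name matched.
reg.matchedLength > 0;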
The latter has the advantage that you can update the text file outside of Switch and the flow stays in sync with it.
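
Since the list is edited by hand, it may be worth building the pattern a little more defensively. The following is a minimal sketch in plain JavaScript, not Switch-specific code: `escapeRegExp` and `buildNameRegExp` are hypothetical helper names, the file content is passed in as a string, and whether the Switch script engine supports every call used here is an assumption to verify.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one alternation from the list, ignoring blank lines and
// Windows line endings. Returns null when the list is empty.
function buildNameRegExp(fileContent) {
  var names = [];
  var lines = fileContent.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var name = lines[i].trim();
    if (name !== "") names.push(escapeRegExp(name));
  }
  if (names.length === 0) return null;
  return new RegExp("(" + names.join("|") + ")");
}

// Hypothetical file content with a blank line, a CRLF and a trailing newline.
var reg = buildNameRegExp("Cat\nDog\n\nHorse\r\nCow\n");
var isMammal = reg.test("Dog.pdf");     // true
var isSalmon = reg.test("Salmon.pdf");  // false
```

Skipping blank lines avoids the empty alternative that would otherwise match every job, and escaping keeps an entry such as `C++` from breaking the pattern.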