Extracting data from xml

LasseThid
Advanced member
Posts: 353
Joined: Tue Mar 03, 2015 2:30 pm
Location: Molndal, Sweden

Extracting data from xml

Post by LasseThid »

I'm working on a flow that creates impositions using Phoenix.
As part of the output I can get an XML file. The XML file has information on all products that are part of this particular imposition.
Is it possible to somehow filter the XML file and keep only the job ID (P107586_500.PDF in this case) and the names of all products?
The number of products in the XML will vary with every imposition made.

Code:

<?xml version="1.0" encoding="UTF-8"?>
<job>
   <id>P107586_500.PDF</id>
   <name></name>
   <contact></contact>
   <phone></phone>
   <client></client>
   <notes></notes>
   <default-bleed>5mm</default-bleed>
   <units>mm</units>
   <run-length>53</run-length>
   <press-minutes>47</press-minutes>
   <plate-cost>0.0</plate-cost>
   <stock-cost>4849.6650390625</stock-cost>
   <press-cost>1175.0</press-cost>
   <die-cost>0.0</die-cost>
   <total-cost>6024.6650390625</total-cost>
   <waste>0.34245550632476807</waste>
   <sheet-usage>0.6751038432121277</sheet-usage>
   <underrun>0.0</underrun>
   <overrun>0.044802866876125336</overrun>
   <layout-count>1</layout-count>
   <layouts>
      <layout>
         <index>1</index>
         <name>Layout 1</name>
         <workstyle>FlatWork</workstyle>
         <run-length>53</run-length>
         <press-minutes>47</press-minutes>
         <plates>4</plates>
         <plate-cost>0.0</plate-cost>
         <stock-cost>4849.6650390625</stock-cost>
         <press-cost>1175.0</press-cost>
         <die-cost>0.0</die-cost>
         <total-cost>6024.6650390625</total-cost>
         <waste>0.34245550632476807</waste>
         <sheet-usage>0.6751038432121277</sheet-usage>
         <underrun>0.0</underrun>
         <overrun>0.044802866876125336</overrun>
         <product-count>3</product-count>
         <tool-stats>
            <categories>
               <category>
                  <name>Cut</name>
                  <length>17756.4604mm</length>
               </category>
               <category>
                  <name>Crease</name>
                  <length>0mm</length>
               </category>
            </categories>
         </tool-stats>
         <priority-stats>
            <priority-stat>
               <priority>1</priority>
               <sheet-usage>0.67510384</sheet-usage>
            </priority-stat>
         </priority-stats>
         <surfaces>
            <surface>
               <side>Front</side>
               <press>
                  <name>Plano Skärbord</name>
                  <id>9c8883a0-5391-4fd9-85fc-81f4de7e976d</id>
               </press>
               <stock>
                  <name>Kanalplast</name>
                  <id>38fba052-222f-44bc-b3e1-38c800a8168c</id>
               </stock>
               <grade>
                  <name>1780 gsm</name>
                  <id>f08a4792-791a-4736-aec5-bbc3aae2c5b4</id>
               </grade>
               <sheet>
                  <name>1600 x 1780mm Long</name>
                  <id>08e98c5f-a17c-42e0-a244-b629e06b975e</id>
                  <width>1600mm</width>
                  <height>1780mm</height>
               </sheet>
               <inks>
                  <ink>
                     <name>Black</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Cyan</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Magenta</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Yellow</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Cut</name>
                     <separation>true</separation>
                     <type>cut</type>
                  </ink>
               </inks>
            </surface>
         </surfaces>
      </layout>
   </layouts>
   <products>
      <product>
         <name>P107586_400.PDF</name>
         <index>1</index>
         <ordered>150</ordered>
         <die-source>_11F7C_P107586_400.PDF</die-source>
         <die-path>C:/Users/Admin/AppData/Local/Temp/phoenix-c9ce838b-cd34-4f72-8621-f623c3e516db/P107586_500PDF-4116310353162221012/upload/1/_11F7C_P107586_400.PDF</die-path>
         <stock>Kanalplast</stock>
         <grade>1780 gsm</grade>
         <grain>None</grain>
         <width>399.5282mm</width>
         <height>463.6602mm</height>
         <spacing-type>Uniform</spacing-type>
         <group>Kanalplast_1780 gsm</group>
         <priority>1</priority>
         <rotation>Orthogonal</rotation>
         <templates/>
         <placed>3</placed>
         <total>159</total>
         <overrun>9</overrun>
         <layouts>
            <layout index="1" placed="3"/>
         </layouts>
      </product>
      <product>
         <name>P107587_300.PDF</name>
         <index>2</index>
         <ordered>39</ordered>
         <die-source>_11F7E_P107586_300.PDF</die-source>
         <die-path>C:/Users/Admin/AppData/Local/Temp/phoenix-c9ce838b-cd34-4f72-8621-f623c3e516db/P107586_500PDF-4116310353162221012/upload/2/_11F7E_P107586_300.PDF</die-path>
         <stock>Kanalplast</stock>
         <grade>1780 gsm</grade>
         <grain>None</grain>
         <width>299.4127mm</width>
         <height>347.5097mm</height>
         <spacing-type>Uniform</spacing-type>
         <group>Kanalplast_1780 gsm</group>
         <priority>1</priority>
         <rotation>Orthogonal</rotation>
         <templates/>
         <placed>1</placed>
         <total>53</total>
         <overrun>14</overrun>
         <layouts>
            <layout index="1" placed="1"/>
         </layouts>
      </product>
      <product>
         <name>P107588s_500.PDF</name>
         <index>3</index>
         <ordered>369</ordered>
         <die-source>_11F7A_P107586_500.PDF</die-source>
         <die-path>C:/Users/Admin/AppData/Local/Temp/phoenix-c9ce838b-cd34-4f72-8621-f623c3e516db/P107586_500PDF-4116310353162221012/upload/3/_11F7A_P107586_500.PDF</die-path>
         <stock>Kanalplast</stock>
         <grade>1780 gsm</grade>
         <grain>None</grain>
         <width>499.413mm</width>
         <height>579.5782mm</height>
         <spacing-type>Uniform</spacing-type>
         <group>Kanalplast_1780 gsm</group>
         <priority>1</priority>
         <rotation>Orthogonal</rotation>
         <templates/>
         <placed>7</placed>
         <total>371</total>
         <overrun>2</overrun>
         <layouts>
            <layout index="1" placed="7"/>
         </layouts>
      </product>
   </products>
</job>
Thanks
Lasse
Enfocus Switch, Enfocus PitStop Server, Enfocus PDF Review, HP SmartStream & Kodak Prinergy with RBA
Offset 72x102, Offset Large Format, Digital Large Format and Digital print.
cstevens
Member
Posts: 103
Joined: Tue Feb 12, 2013 8:42 pm

Re: Extracting data from xml

Post by cstevens »

How do you want the information saved? Do you want a new XML file with just that information in it, or are you trying to pull it into a script for use elsewhere?

If you want a new XML file then an xsl transform is probably the easiest option.

If the latter, you can do something like this:

Code:

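	// Open the incoming job, assumed here to be the Phoenix XML report
	var xmlDoc = new Document(job.getPath());
	// Read the text of the <id> element under <job>
	var jobId = xmlDoc.evalToString("//job/id");
	s.log(1, "JobID is: " + jobId);
	// Select every <name> element that sits directly under a <product>
	var productNames = xmlDoc.evalToNodes("//job/products/product/name");
	// Log each product name in turn (i starts at 0)
	for(var i=0; i<productNames.length; i++){
		s.log(1, "ProductName " + i + " is: " + productNames.at(i).evalToString("./text()"));
	}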
cstevens
Member
Posts: 103
Joined: Tue Feb 12, 2013 8:42 pm

Re: Extracting data from xml

Post by cstevens »

It would probably be easier to add product name values to an array like this:

Code:

	// Open the incoming job, assumed here to be the Phoenix XML report
	var xmlDoc = new Document(job.getPath());
	var jobId = xmlDoc.evalToString("//job/id");
	s.log(1, "JobID is: " + jobId);
	// Select every <name> element that sits directly under a <product>
	var productNames = xmlDoc.evalToNodes("//job/products/product/name");
	// Collect the text of each <name> node into a plain array
	var namesArray = [];
	for(var i=0; i<productNames.length; i++){
		namesArray.push(productNames.at(i).evalToString("./text()"));
	}
	s.log(1, "List of ProductNames is: " + namesArray.toString());
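
If a single line of text is more useful than a log entry, the job ID and the names gathered above could be joined with a separator. This is only a sketch in plain JavaScript: `toDelimitedLine` is a hypothetical helper name, not part of the Switch scripting API, and the choice of separator is an assumption.

```javascript
// Hypothetical helper: put the job ID first, then every product name,
// joined by the given separator.
function toDelimitedLine(jobId, namesArray, separator) {
	var parts = [jobId];
	for (var i = 0; i < namesArray.length; i++) {
		parts.push(namesArray[i]);
	}
	return parts.join(separator);
}
```

Inside the script above it would be called as `toDelimitedLine(jobId, namesArray, ";")` once the loop has filled `namesArray`.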
sander
Advanced member
Posts: 274
Joined: Wed Oct 01, 2014 8:58 am
Location: The Netherlands

Re: Extracting data from xml

Post by sander »

cstevens wrote: If you want a new XML file then an xsl transform is probably the easiest option.
If you might need this, Lars, it’s already done here: viewtopic.php?f=12&t=1201&p=4080&hilit=Xslt+split#p4080

I’m using the Saxon configurator now to do the same. If you want, I’ll post it next Monday.
LasseThid
Advanced member
Posts: 353
Joined: Tue Mar 03, 2015 2:30 pm
Location: Molndal, Sweden

Re: Extracting data from xml

Post by LasseThid »

The flow will make ganging impositions for our large format digital printers, and I need to somehow let the printer operators know which jobs are in which file. Phoenix is currently set up to use the name of the first file for the imposed file, but as it's a ganging imposition there may be several orders in one imposition.
When exporting the imposed PDF from Phoenix I can create either a report (PDF) or an XML file like the one I posted.
My first idea was to print the report on paper and give it to the printer operator together with the work order(s), but the large format print manager doesn't think that will work very well, so he would like to have all order numbers in the file name instead. That way they can find the job just by searching for the order number. The biggest problem with this, of course, is when there are many different orders on one sheet, but he figures we will have no more than two or three orders on the same sheet.
If I can filter the XML file to include only the distinct order numbers, I might be able to use that XML to send one or more e-mails to our planning tool and include the information as a note in the planned job, i.e. "this job is in the file named XXXXXXXXX.pdf". That way it would be easy for the print operator to find which file(s) to print.
Worst case, the prepress operators will have to sort the incoming jobs by stock and cutting method and then use the reports to add this information in the planning tool.
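
The idea of putting every order number in the file name could be sketched in plain JavaScript as follows. It assumes the order number is the part of each product name before the first underscore (P107586 in P107586_400.PDF), which the sample XML suggests but nothing in this thread confirms. `orderNumberFromName` and `gangedFileName` are hypothetical helper names, not part of the Switch scripting API, and the underscore separator and .pdf extension are assumptions too.

```javascript
// Hypothetical helper: take everything before the first underscore as the
// order number. If there is no underscore, the whole name is returned.
function orderNumberFromName(productName) {
	var pos = productName.indexOf("_");
	return pos === -1 ? productName : productName.substring(0, pos);
}

// Hypothetical helper: build one file name that lists each distinct order
// number once, in the order the products appear.
function gangedFileName(productNames) {
	var seen = {};
	var orderNumbers = [];
	for (var i = 0; i < productNames.length; i++) {
		var orderNumber = orderNumberFromName(productNames[i]);
		if (!Object.prototype.hasOwnProperty.call(seen, orderNumber)) {
			seen[orderNumber] = true;
			orderNumbers.push(orderNumber);
		}
	}
	return orderNumbers.join("_") + ".pdf";
}
```

Fed the three product names from the sample XML, this would give P107586_P107587_P107588s.pdf. With two or three orders per sheet the name stays short enough to search on; with many more it would grow quickly.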
LasseThid
Advanced member
Posts: 353
Joined: Tue Mar 03, 2015 2:30 pm
Location: Molndal, Sweden

Re: Extracting data from xml

Post by LasseThid »

I've tried to create an XSLT file using copy/paste and Google, but when I put this file in an online transformer along with the XML I get an error saying "The element type "xsl:template" must be terminated by the matching end-tag "</xsl:template>"."
As far as I can see it's terminated correctly, but on the other hand I would understand almost as much if I tried to read something in Mandarin... :lol:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="1.0">
	<xsl:output method="text"/>

	<xsl:template match="/">
		<xsl:value-of select="//job/id"/>
		<xsl:text>;</xsl:text>
		<xsl:for-each select="product"/>
			<xsl:value-of select="//products/product/name"/>
			<xsl:text>;</xsl:text>
		</xsl:for-each>
	</xsl:template>
</xsl:stylesheet>
cstevens
Member
Posts: 103
Joined: Tue Feb 12, 2013 8:42 pm

Re: Extracting data from xml

Post by cstevens »

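You've got an extra slash at the end of your for-each opening tag. That makes it a self-closing element, so the later </xsl:for-each> has nothing to close and the parser complains about xsl:template instead:

Code:

<xsl:for-each select="product"/>
should be:

Code:

<xsl:for-each select="product">
If you get rid of that it should be valid.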