Extracting data from xml

LasseThid
Advanced member
Posts: 365
Joined: Tue Mar 03, 2015 2:30 pm
Location: Molndal, Sweden

Extracting data from xml

Post by LasseThid »

I'm working on a flow that creates impositions using Phoenix.
As part of the output I can get an xml file. The xml file has information on all products that are part of this particular imposition.
Is it possible to somehow filter the xml file and only keep the job id (P107586_500.PDF in this case) and the names of all products?
The number of products in the xml will vary with every imposition made.

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<job>
   <id>P107586_500.PDF</id>
   <name></name>
   <contact></contact>
   <phone></phone>
   <client></client>
   <notes></notes>
   <default-bleed>5mm</default-bleed>
   <units>mm</units>
   <run-length>53</run-length>
   <press-minutes>47</press-minutes>
   <plate-cost>0.0</plate-cost>
   <stock-cost>4849.6650390625</stock-cost>
   <press-cost>1175.0</press-cost>
   <die-cost>0.0</die-cost>
   <total-cost>6024.6650390625</total-cost>
   <waste>0.34245550632476807</waste>
   <sheet-usage>0.6751038432121277</sheet-usage>
   <underrun>0.0</underrun>
   <overrun>0.044802866876125336</overrun>
   <layout-count>1</layout-count>
   <layouts>
      <layout>
         <index>1</index>
         <name>Layout 1</name>
         <workstyle>FlatWork</workstyle>
         <run-length>53</run-length>
         <press-minutes>47</press-minutes>
         <plates>4</plates>
         <plate-cost>0.0</plate-cost>
         <stock-cost>4849.6650390625</stock-cost>
         <press-cost>1175.0</press-cost>
         <die-cost>0.0</die-cost>
         <total-cost>6024.6650390625</total-cost>
         <waste>0.34245550632476807</waste>
         <sheet-usage>0.6751038432121277</sheet-usage>
         <underrun>0.0</underrun>
         <overrun>0.044802866876125336</overrun>
         <product-count>3</product-count>
         <tool-stats>
            <categories>
               <category>
                  <name>Cut</name>
                  <length>17756.4604mm</length>
               </category>
               <category>
                  <name>Crease</name>
                  <length>0mm</length>
               </category>
            </categories>
         </tool-stats>
         <priority-stats>
            <priority-stat>
               <priority>1</priority>
               <sheet-usage>0.67510384</sheet-usage>
            </priority-stat>
         </priority-stats>
         <surfaces>
            <surface>
               <side>Front</side>
               <press>
                  <name>Plano Skärbord</name>
                  <id>9c8883a0-5391-4fd9-85fc-81f4de7e976d</id>
               </press>
               <stock>
                  <name>Kanalplast</name>
                  <id>38fba052-222f-44bc-b3e1-38c800a8168c</id>
               </stock>
               <grade>
                  <name>1780 gsm</name>
                  <id>f08a4792-791a-4736-aec5-bbc3aae2c5b4</id>
               </grade>
               <sheet>
                  <name>1600 x 1780mm Long</name>
                  <id>08e98c5f-a17c-42e0-a244-b629e06b975e</id>
                  <width>1600mm</width>
                  <height>1780mm</height>
               </sheet>
               <inks>
                  <ink>
                     <name>Black</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Cyan</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Magenta</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Yellow</name>
                     <separation>false</separation>
                     <type>normal</type>
                  </ink>
                  <ink>
                     <name>Cut</name>
                     <separation>true</separation>
                     <type>cut</type>
                  </ink>
               </inks>
            </surface>
         </surfaces>
      </layout>
   </layouts>
   <products>
      <product>
         <name>P107586_400.PDF</name>
         <index>1</index>
         <ordered>150</ordered>
         <die-source>_11F7C_P107586_400.PDF</die-source>
         <die-path>C:/Users/Admin/AppData/Local/Temp/phoenix-c9ce838b-cd34-4f72-8621-f623c3e516db/P107586_500PDF-4116310353162221012/upload/1/_11F7C_P107586_400.PDF</die-path>
         <stock>Kanalplast</stock>
         <grade>1780 gsm</grade>
         <grain>None</grain>
         <width>399.5282mm</width>
         <height>463.6602mm</height>
         <spacing-type>Uniform</spacing-type>
         <group>Kanalplast_1780 gsm</group>
         <priority>1</priority>
         <rotation>Orthogonal</rotation>
         <templates/>
         <placed>3</placed>
         <total>159</total>
         <overrun>9</overrun>
         <layouts>
            <layout index="1" placed="3"/>
         </layouts>
      </product>
      <product>
         <name>P107587_300.PDF</name>
         <index>2</index>
         <ordered>39</ordered>
         <die-source>_11F7E_P107586_300.PDF</die-source>
         <die-path>C:/Users/Admin/AppData/Local/Temp/phoenix-c9ce838b-cd34-4f72-8621-f623c3e516db/P107586_500PDF-4116310353162221012/upload/2/_11F7E_P107586_300.PDF</die-path>
         <stock>Kanalplast</stock>
         <grade>1780 gsm</grade>
         <grain>None</grain>
         <width>299.4127mm</width>
         <height>347.5097mm</height>
         <spacing-type>Uniform</spacing-type>
         <group>Kanalplast_1780 gsm</group>
         <priority>1</priority>
         <rotation>Orthogonal</rotation>
         <templates/>
         <placed>1</placed>
         <total>53</total>
         <overrun>14</overrun>
         <layouts>
            <layout index="1" placed="1"/>
         </layouts>
      </product>
      <product>
         <name>P107588s_500.PDF</name>
         <index>3</index>
         <ordered>369</ordered>
         <die-source>_11F7A_P107586_500.PDF</die-source>
         <die-path>C:/Users/Admin/AppData/Local/Temp/phoenix-c9ce838b-cd34-4f72-8621-f623c3e516db/P107586_500PDF-4116310353162221012/upload/3/_11F7A_P107586_500.PDF</die-path>
         <stock>Kanalplast</stock>
         <grade>1780 gsm</grade>
         <grain>None</grain>
         <width>499.413mm</width>
         <height>579.5782mm</height>
         <spacing-type>Uniform</spacing-type>
         <group>Kanalplast_1780 gsm</group>
         <priority>1</priority>
         <rotation>Orthogonal</rotation>
         <templates/>
         <placed>7</placed>
         <total>371</total>
         <overrun>2</overrun>
         <layouts>
            <layout index="1" placed="7"/>
         </layouts>
      </product>
   </products>
</job>
Thanks
Lasse
Enfocus Switch, Enfocus PitStop Server, Enfocus PDF Review, HP SmartStream & Kodak Prinergy with RBA
Offset 72x102, Offset Large Format, Digital Large Format and Digital print.
cstevens
Member
Posts: 106
Joined: Tue Feb 12, 2013 8:42 pm

Re: Extracting data from xml

Post by cstevens »

How do you want the information saved? Do you want a new XML file with just that information in it, or are you trying to pull it into a script for use elsewhere?

If you want a new XML file then an XSLT transform is probably the easiest option.

If the latter then you can do something like this:

Code: Select all

	var xmlDoc = new Document(job.getPath());
	var jobId = xmlDoc.evalToString("//job/id");
	s.log(1, "JobID is: " + jobId);
	var productNames = xmlDoc.evalToNodes("//job/products/product/name");
	for(var i=0; i<productNames.length; i++){
		s.log(1, "ProductName " + i + " is: " + productNames.at(i).evalToString("./text()"));
	}
cstevens
Member
Posts: 106
Joined: Tue Feb 12, 2013 8:42 pm

Re: Extracting data from xml

Post by cstevens »

It would probably be easier to add product name values to an array like this:

Code: Select all

	// Load the XML file that arrived as the job
	var xmlDoc = new Document(job.getPath());
	var jobId = xmlDoc.evalToString("//job/id");
	s.log(1, "JobID is: " + jobId);
	// Select every product name node, however many products there are
	var productNames = xmlDoc.evalToNodes("//job/products/product/name");
	var namesArray = [];
	for(var i=0; i<productNames.length; i++){
		// Collect the text of each name node
		namesArray.push(productNames.at(i).evalToString("./text()"));
	}
	s.log(1, "List of ProductNames is: " + namesArray.toString());
sander
Advanced member
Posts: 311
Joined: Wed Oct 01, 2014 8:58 am
Location: Den Bosch

Re: Extracting data from xml

Post by sander »

cstevens wrote: If you want a new XML file then an xsl transform is probably the easiest option.
If you might need this, Lars, it’s already done here: viewtopic.php?f=12&t=1201&p=4080&hilit=Xslt+split#p4080

I’m using the Saxon configurator now to do the same. If you want, I’ll post it next Monday.
LasseThid
Advanced member
Posts: 365
Joined: Tue Mar 03, 2015 2:30 pm
Location: Molndal, Sweden

Re: Extracting data from xml

Post by LasseThid »

The flow will make ganging impositions for our large format digital printers and I need to somehow let the printer operators know which jobs are in which file. Phoenix is currently set up to use the name of the first file for the imposed file, but as it's a ganging imposition there may be several orders in one imposition.
When exporting the imposed pdf from Phoenix I can either create a report (PDF) or an xml file like the one I posted.
My first idea was to print the report on paper and give it to the printer operator together with the work order(s), but the large format print manager doesn't think it will work very well, so he would like to have all order numbers in the file name instead. That way they can find the job by just searching for the order number. The biggest problem with this, of course, is if you have many different orders on one sheet, but he figures we will have no more than two or three orders on the same sheet.
If I can filter the xml file to include only the different order numbers, I might be able to use that xml to send one or more e-mails to our planning tool and include the information as a note in the planned job, i.e. "this job is in the file named XXXXXXXXX.pdf". That way it would be easy for the print operator to find which file(s) to print.
Worst case the prepress operators will have to sort the incoming jobs by stock and cutting method and then use the reports to add this information in the planning tool.
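
A minimal sketch of the file-naming idea, written in plain JavaScript so it can be tried outside Switch. It assumes the order number is the part of each product name before the first underscore (as in P107586_400.PDF), which is only a guess based on the sample xml. The function names orderNumbersFromNames and buildFileName are hypothetical, not part of the Switch scripting API.

```javascript
// Hypothetical helper: reduce product names to a list of unique order numbers.
// Assumption: the order number is everything before the first underscore.
function orderNumbersFromNames(productNames) {
  const seen = new Set();
  const orders = [];
  for (const name of productNames) {
    const order = name.split("_")[0];
    if (!seen.has(order)) {
      seen.add(order); // keep first occurrence, preserve original order
      orders.push(order);
    }
  }
  return orders;
}

// Hypothetical helper: join the order numbers into one file name.
function buildFileName(productNames, extension) {
  return orderNumbersFromNames(productNames).join("_") + extension;
}

// Product names taken from the sample xml in this thread
const names = ["P107586_400.PDF", "P107587_300.PDF", "P107588s_500.PDF"];
console.log(buildFileName(names, ".pdf")); // P107586_P107587_P107588s.pdf
```

Inside a Switch script the names array could come from a loop over the product name nodes. With two or three orders per sheet the resulting name stays short, but it grows with every extra order.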
LasseThid
Advanced member
Posts: 365
Joined: Tue Mar 03, 2015 2:30 pm
Location: Molndal, Sweden

Re: Extracting data from xml

Post by LasseThid »

I've tried to create an XSLT file using copy/paste and google, but when I put this file in an online transformer along with the xml I get an error saying "The element type "xsl:template" must be terminated by the matching end-tag "</xsl:template>"."
As far as I can see it's terminated correctly, but on the other hand I would understand almost as much if I tried to read something in Mandarin... :lol:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="1.0">
	<xsl:output method="text"/>

	<xsl:template match="/">
		<xsl:value-of select="//job/id"/>
		<xsl:text>;</xsl:text>
		<xsl:for-each select="product"/>
			<xsl:value-of select="//products/product/name"/>
			<xsl:text>;</xsl:text>
		</xsl:for-each>
	</xsl:template>
</xsl:stylesheet>
cstevens
Member
Posts: 106
Joined: Tue Feb 12, 2013 8:42 pm

Re: Extracting data from xml

Post by cstevens »

You've got an extra slash at the end of your for-each opening tag, which makes it self-closing:

Code: Select all

<xsl:for-each select="product"/>
should be:

Code: Select all

<xsl:for-each select="product">
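If you get rid of that slash the stylesheet should be well-formed. Note that with the template matching "/", select="product" still finds nothing, because product is not a child of the document root. Using select="job/products/product" with <xsl:value-of select="name"/> inside the loop would output each product name.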
Davidjohn321
Newbie
Posts: 1
Joined: Thu Nov 13, 2025 8:54 pm
Location: Dammam

Re: Extracting data from xml

Post by Davidjohn321 »

It usually means there’s either a missing closing tag somewhere before the <xsl:template> ends, or an extra character that breaks the structure. XSLT is really strict: one unclosed <xsl:if> or <xsl:for-each>, or even a stray / or <, can cause the parser to think the </xsl:template> never appears.

A good way to debug it is to paste the XSLT into an XML-aware editor (like VS Code or Notepad++ with XML tools). They’ll highlight which tag isn’t balanced. Sometimes indentation alone helps reveal the missing closure.
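
A rough sketch of what such a balance check does, in plain JavaScript. It is regex-based and assumes the input has no comments, CDATA sections, or > characters inside attribute values, so it is an illustration rather than a real XML parser. The function name findUnbalancedTag is hypothetical.

```javascript
// Hypothetical helper: report the first tag that breaks the open/close balance.
// Assumption: no comments, CDATA sections, or ">" inside attribute values.
function findUnbalancedTag(xml) {
  // Groups: 1 = leading slash, 2 = tag name, 3 = attributes, 4 = trailing slash
  const tagPattern = /<(\/?)([A-Za-z_][\w:.-]*)([^<>]*?)(\/?)>/g;
  const stack = [];
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const [, closing, name, , selfClosing] = match;
    if (selfClosing) continue; // <tag/> opens and closes itself
    if (!closing) {
      stack.push(name); // opening tag: remember it
      continue;
    }
    const expected = stack.pop(); // closing tag: must match the last open one
    if (expected !== name) {
      return "Found </" + name + "> but expected " +
        (expected ? "</" + expected + ">" : "no closing tag");
    }
  }
  return stack.length ? "Missing </" + stack.pop() + ">" : null;
}

// The template from the stylesheet posted earlier in this thread
const broken = [
  '<xsl:template match="/">',
  '  <xsl:for-each select="product"/>',
  '    <xsl:value-of select="//products/product/name"/>',
  '  </xsl:for-each>',
  '</xsl:template>'
].join("\n");

console.log(findUnbalancedTag(broken));
// Found </xsl:for-each> but expected </xsl:template>
```

Because the self-closing for-each never goes on the stack, the later </xsl:for-each> is compared against the still-open xsl:template, which matches the error the online transformer reported.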

Also keep an eye out for typos in the element names: if something is written as <xsl : template> (with a space), it is not a valid tag at all, even if it looks correct at a glance.

It’s kind of like dealing with an isolation transformer in real electrical setups: one tiny wiring mistake and the whole thing stops behaving as expected.

If you want, feel free to paste your XSLT here and we can point out the exact tag causing the issue.
NEOSA
Member
Posts: 53
Joined: Thu Mar 10, 2016 6:31 pm

Re: Extracting data from xml

Post by NEOSA »

Hi LasseThid,

Maybe you can also try this app:

https://www0.enfocus.com/en/appstore/pr ... -text-file

Just add an XML pickup before using this app, and create the content as XML with the desired values. That way you don't need to work with XSLT transforms.

If you need a more efficient way, this app can do the job:

https://www0.enfocus.com/en/appstore/pr ... ta-toolkit
jan_suhr
Advanced member
Posts: 696
Joined: Fri Nov 04, 2011 1:12 pm
Location: Nyköping, Sweden

Re: Extracting data from xml

Post by jan_suhr »

This is an 8-year-old thread that an AI bot replied to.
Jan Suhr
Color Consult AB
Sweden
=============
Check out my apps