read text from *.txt-files and save it to variables

bkromer
Member
Posts: 99
Joined: Thu Jul 11, 2019 10:41 am

read text from *.txt-files and save it to variables

Post by bkromer »

Hello there,

I convert the incoming PDF to *.txt files with callasToolbox. In the next step, I want to extract the text from these files and save it to a variable, so that I can run a regex search on it and find specific numbers that match my patterns.

My code so far:
const fs = require('fs');
fs.readFile('Input.txt', 'utf-8', (err, data) => {
  if (err) throw err;
  console.log(data); // data holds the file's text, but only inside this callback
});

My question is: in which variable (or similar) do I need to save the data so that I can use it afterwards?
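
One common way to answer this in plain Node.js (not Switch scripting) is to read the file synchronously, so the text comes back as a return value that can be stored in a variable and searched afterwards. The sketch below is only an illustration: the helper names `readText` and `findAll`, the temporary demo file, and the `Order (\d+)` pattern are hypothetical stand-ins for the real `Input.txt` and the real number pattern.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Read the whole file synchronously; the return value is the file's text.
function readText(filePath) {
  return fs.readFileSync(filePath, 'utf-8');
}

// Collect the first capture group of every match of `pattern` in `text`.
// `pattern` must carry the g flag, which matchAll requires.
function findAll(text, pattern) {
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Demo with a temporary file standing in for Input.txt (hypothetical content).
const demoFile = path.join(os.tmpdir(), 'Input-demo.txt');
fs.writeFileSync(demoFile, 'Order 12345\nOrder 67890\n', 'utf-8');

const text = readText(demoFile);               // the variable that holds the text
const numbers = findAll(text, /Order (\d+)/g); // regex search on that variable
console.log(numbers);
```

With the asynchronous `fs.readFile`, the text is only available inside the callback, so any regex search would have to happen there as well.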

Thanks in Advance,
Ben
Benjamin
jan_suhr
Advanced member
Posts: 586
Joined: Fri Nov 04, 2011 1:12 pm
Location: Nyköping, Sweden

Re: read text from *.txt-files and save it to variables

Post by jan_suhr »

There is an app that does that for you.

https://www.enfocus.com/en/appstore/pro ... t-from-pdf
Jan Suhr
Color Consult AB
Sweden
=============
Check out my apps
bkromer
Member
Posts: 99
Joined: Thu Jul 11, 2019 10:41 am

Re: read text from *.txt-files and save it to variables

Post by bkromer »

Okay, that's plan B 8-) but first I want to try doing it on my own. Does anyone have hints or code snippets on how to do this?
Benjamin
Padawan
Advanced member
Posts: 358
Joined: Mon Jun 12, 2017 8:48 pm
Location: Belgium
Contact:

Re: read text from *.txt-files and save it to variables

Post by Padawan »

Your sample looks very node.js-ish. Switch javascript looks a bit different.

I always start from the samples in the documentation:
https://www.enfocus.com/manuals/Develop ... class.html

The static text I/O functions are the easiest to use.
bkromer
Member
Posts: 99
Joined: Thu Jul 11, 2019 10:41 am

Re: read text from *.txt-files and save it to variables

Post by bkromer »

Thanks for the link.
But I still don't get it.

In my flow I want to rename my PDF file using specific metadata fields. I have the Email.Body(Text) object, and I want to pick out a specific part of this string with a regex and rename the PDF file accordingly.
What I don't understand is how to pass this string.
Image: https://imgur.com/S0RP0hC

I tried this:
Image: https://imgur.com/KnsVqXI

Thanks in Advance
Best regards, Ben
Benjamin
Padawan
Advanced member
Posts: 358
Joined: Mon Jun 12, 2017 8:48 pm
Location: Belgium
Contact:

Re: read text from *.txt-files and save it to variables

Post by Padawan »

OK, a few things are off:
You are using a script expression, so you don't need the jobArrived entry point. You can just write your code, store the output in a variable, and on the last line type the variable name followed by a semicolon.


job.getVariableAsString() expects a Switch variable as a string, so that's not going to work.

It's better to just end with
re[1];
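
As an illustration of what ending with `re[1];` means, here is a sketch in standard JavaScript. The `body` text and the pattern are hypothetical, and `exec` is the standard JavaScript call; whether it behaves identically inside a Switch script expression is an assumption to verify against the Enfocus documentation.

```javascript
// Hypothetical sample text standing in for the e-mail body.
var body = "Dear customer,\nOrder number: AB-1234\nKind regards";

// exec returns null when nothing matches, otherwise an array:
// re[0] is the whole match, re[1] the first capture group.
var re = /Order number:\s*(\S+)/.exec(body);

// Fall back to an empty string so a missing match does not throw.
var result = re ? re[1] : "";

// In a script expression, the final line would be the bare value, for example:
// result;
console.log(result);
```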


Regular expressions in Switch can behave differently than regular expressions in Node.js, so it is useful to check the Enfocus documentation for what they support. For example, only the g and i flags are supported; m is not:
https://www.enfocus.com/manuals/Develop ... egexp.html
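
If the `m` flag is indeed unavailable, one possible workaround is to split the text into lines first and apply a pattern anchored with `^` to each line separately. This is a sketch in standard JavaScript; the helper name `findInLines` and the sample text are hypothetical.

```javascript
// Return the first capture group from the first line that matches `lineRe`.
// `lineRe` may use ^ and $ because every line is tested on its own,
// so no multiline (m) flag is needed.
function findInLines(text, lineRe) {
  var lines = text.split(/\r?\n/); // handles both LF and CRLF line endings
  for (var i = 0; i < lines.length; i++) {
    var m = lineRe.exec(lines[i]);
    if (m) {
      return m[1];
    }
  }
  return null; // no line matched
}

var sample = "Customer: ACME\r\nInvoice: 2024-17\r\nTotal: 99";
console.log(findInLines(sample, /^Invoice:\s*(.+)$/));
```

The pattern passed in should not carry the `g` flag, since a global regex keeps its position between `exec` calls.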

There is no true debugger with breakpoints, but you can use job.log() and s.log() like you would use console.log() in other environments. The difference is that the logs don't go to a console but to the Switch messages. There are different types of log messages, selected by the first argument:
job.log(-1, "this is a debug message") ;
job.log(1, "this is a normal message") ;
job.log(3, "this is an error message") ;
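
To try script logic outside Switch, the logging calls can be pointed at a small stand-in object. The sketch below is hypothetical throughout: `makeFakeJob` is not part of Switch, and it only names the three levels listed above, showing any other level as its number.

```javascript
// Hypothetical stand-in for the Switch job object, for testing outside Switch.
// It records messages in an array instead of sending them to Switch messages.
function makeFakeJob() {
  var names = { "-1": "debug", "1": "normal", "3": "error" };
  var messages = [];
  return {
    messages: messages,
    log: function (level, text) {
      var label = names[String(level)] || String(level);
      messages.push("[" + label + "] " + text);
    }
  };
}

var job = makeFakeJob();
job.log(-1, "this is a debug message");
job.log(1, "this is a normal message");
job.log(3, "this is an error message");
console.log(job.messages.join("\n"));
```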
bkromer
Member
Posts: 99
Joined: Thu Jul 11, 2019 10:41 am

Re: read text from *.txt-files and save it to variables

Post by bkromer »

Hello everyone,
I have found a solution for my task.

Code: Select all

function jobArrived( s : Switch, job : Job )
{
	// Read the text file
	var pdf = File.read( job.getPath(), "UTF-8" ); // "pdf" now holds the entire text of the file.

	// Split into rows
	var csvRows = pdf.split( /\n/ ); // Splitting on newlines gives one array entry per row of the file.

	// Regex search
	var rechnungNrRE = /RECHNUNG\sNR\.\s(\d+)/g; // Capture the digits that follow "RECHNUNG NR. " (the invoice number).
	rechnungNrRE.search( pdf );
	var rechnungNr = rechnungNrRE.cap(1);

	job.log(1, "Rechnungsnummer:  "+rechnungNr ); // Log the value for debugging purposes.
	job.setPrivateData("pdRechnungNr",rechnungNr); // Store the value as private data on the job.


}
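
The same extraction can be tried outside Switch before deploying the script. The sketch below uses standard JavaScript (`exec` rather than the `search`/`cap` calls of the script above), a hypothetical helper name `extractRechnungNr`, and made-up sample text. It also returns `null` when no invoice number is found, a case the script above does not check for.

```javascript
// Return the digits following "RECHNUNG NR. " or null if the text has none.
function extractRechnungNr(text) {
  var match = /RECHNUNG\sNR\.\s(\d+)/.exec(text);
  return match ? match[1] : null;
}

// Made-up sample text standing in for the converted PDF.
var sampleText = "Firma Beispiel GmbH\nRECHNUNG NR. 20190042\nDatum: 11.07.2019";
console.log(extractRechnungNr(sampleText));
console.log(extractRechnungNr("no invoice here"));
```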
Best regards,
Ben
Benjamin