Tuesday, March 27, 2012

The Pervasive Data Integration Server is a Java server app which by default uses only 64mb of memory. Increasing that value should resolve the 'Java Heap Space' errors that we have been getting from PDI. To implement this setting one must follow these steps:

  1. Locate the PDI installation folder and within that, a folder named "runtime"
  2. Locate a file within the "runtime" folder named "cosmos.ini"
  3. Open the "cosmos.ini" file in a text editor and locate a section named [UserInfo]
  4. Change the default memory allocation by adding a line under the [UserInfo] section as "JvmArgs=-Xmx256m" (or some other valid new memory value
[UPDATE 5/18/12, the original tech provided us the wrong information. The correct information is]:
  1. Locate the PDI installation folder and within that folder is a folder named "etc" which you need to navigate to.
  2. Locate a text file named "com.pervasive.execution.worker.cfg" and open it with a text editor. There you should find the following line: "#engine.java.options="
  3. Remove the hash (#) and add your own -Xmx JVM arg there.
Note: This will affect all your worker processes, so take care because if you have a host with 8GB of RAM and you set your -Xmx setting to 4GB each job you run will use the 4GB.

Implementing this setting may not be effective for your situation. After trying this fix we have found that we are still plagued by JVM memory leaks our main PDI project.

More general info about increasing memory for Java using Xmx parameter can be found online.

Thursday, March 22, 2012

Pervasive Data Integrator 10 Review

What can I say about PDI. So far it has been a complete disappointment. Pervasive has certainly made a huge step forward in embracing a web-based solution in a RIA design using Adobe Flash and all the bells and whistles it has to offer. This comes at the expense of having an application that has significant bugs in Chrome and Internet Explorer. When we received our training the instructor not only faithfully stuck to using Firefox, but whenever we experienced significant bugs in the UI, his retort was simply that he was not experiencing any while using Firefox (hint, hint, use Firefox OK?).

I found that the product was shipped with incomplete, inaccurate or out of date documentation. For example, their documentation states that when executing shell commands from EZ Script that you simply use the Shell() function (as documented) however "If you run a .bat file, you cannot use the following syntax: Shell("file.bat"). Instead, use the following: Shell("cmd /c file.bat")". In fact the documentation completely leaves out the fact that even if you are not running a .bat file, you are actually still required to use "cmd /c" if you are using special characters such as the "|" (pipe) to redirect stdout. That one cost me a few hours till I figured out on my own that, contrary to their documentation "cmd /c" was probably needed in other situations also. A follow up from Pervasive's tech support confirmed this. This may seem minor however, when you're complete noob to PDI 10 the documentation and it's accuracy is all you have to rely on. Small issues can quickly become time burners.

Then we started building our projects and ran into some irritating bugs. For example, in the macro administration screen, when you define a macro that uses two or more single quote for it's value, you will notice that when you save the macro, it will automatically delete half your single quote. Click save again and it will do the same thing, until you are left with just one solitary single quote. Obviously the internal mechanics which save the macro value, is not scrubbing for single quotes which are used in SQL to delimit text. Tech support told us it was because we were running the server as an administrator which is obviously not the issue. We were not running it with an admin account.

Schemas are not not properly processed.  We have a schema that has an element's cardinality set as 0 to unbounded. When that node has no data in XML returned it deserializes just fine using .Net WCF but when PDI encounters the same XML it complains that it cannot find the any instances of the element defined in the schema as having a minOccurance of 0 - well duh! [UPDATE: This bug was fixed]

Then there are design flaws. Let's talk about job control. Going to the Integration Manager you can see how many jobs are queued. Unfortunately you it won't tell you what jobs are queued. It will tell you there are jobs running but can't tell you accurately what jobs. Forget about having the ability to stop running jobs or removing them from the queue. Server Utilization in one place will show 100% while at the same time in another place it could show 65%. Manually, run and then stop a job (if you are smart enough not to hose yourself by closing the browser or clicking away to another tab in the app [UPDATE: This bug was fixed]) from the Integration Manager and it will erroneously report that the job failed.

So imagine I'm in a situation where an exec wants to see a specific job run. I have to suspend all scheduled jobs, shut down the PDI server service and start it back up. Then go to an Integration job and click the run button only to notice that PDI is currently reporting that there are "3 Running" jobs right now. (a) I suspended all job and rebooted the service (b) we only have 1 execution engine in our license -- how can there be "3 Running" jobs ??  The fact is I can't stop what ever these 3 jobs are and run the process the exec wants run right now. Suspending all processes and restarting the server service does nothing to stop these 3 jobs from running. Undeterred I repeat the process. Shut down the service and start it back up again, only now it reports there are 4 jobs running. I still don't believe the dashboard. I run my process knowing that since there's only 1 exec engine license, my process will run if in fact there aren't any other processes running - and behold that is the case, my process ran because in fact there are no other jobs running. The report of "4 Running" job is erroneous. [UPDATE: This bug was fixed somewhere in the subsequent releases we received]

Once we got through all the gotchas and traps and finally had a working project we found that we could only execute one job at a time. One would think that for that ridiculous price tag, that our license would actually allow us to run more than one process at a time. No. Before agreeing to a price with Pervasive, *make absolutely sure* that the person that ordered the product with the license knew about, and made sure that it allows for as many execution engines as you have need for. Pervasive sales failed to mention that very important fact to us.

That being said we can now execute individual jobs one at a time, provided that the system does not hang and have to be restarted as we frequently experience in our case.

Before we paid one penny for the product our PM asked the salesperson if we could get a demo version that we could install and try out. It's always a good idea to test drive a product before dropping the dough. Pervasive never provided it, and it's obvious why, I would have failed the evaluation and we would not be where we are now. Unfortunately things did not turn out that way.

We're still in the process of turning this rotten turnip into something usable. Wish us luck. We're going to need it.

[UPDATE: After 3 months of waiting on memory leaks to get fixed, we sort of have a new PDI release (version 10.2.4.47) that is usable (albeit unstable) for what we need it to do. I would like to give a special thanks to Jason from Pervasive for seemingly being the only person to take personal ownership of the memory leak issues I reported, many moons ago. My only gripe is that 3 months is too long to wait for a bug fix when you're dead in the water - especially since they charged us $45,000 for this product. For the record, I was against the purchase of this product in the first place. There are free open source tools on the market that work much better than Pervasive DI and they come with really good community support. And since they're open source and if you're a programmer like I am, you can just fix any glitches yourself. Unfortunately there are some people who think that the more something costs, the better it is which is part reason to why we ended up with this expensive hunk-of-junk. i.e. this is what happens when "the powers that be" are charmed by sales people and totally ignore the justified technical objections of their architect to such a deeply flawed product]

Wednesday, March 7, 2012

Pervasive DI 10 "Shell" function with redirection characters

Although the Pervasive DI 10 documentation states that using the "cmd /c" directive is only necessary when executing a .bat file, it should be noted that when using special characters in a command, one must also use the "cmd /c" directive. Even when calling EXEs.

EZ Script example:

Wrong:

Shell("ipconfig > myIpAddress.txt")


Right

Shell("cmd /c ipconfig > myIpAddress.txt")


Apparently, this is because the Shell function (and the Application step) are subject to unexpected behavior when dealing with redirection characters such as | (pipe), < or >

An example of code that uses 2 of these characters:

Shell("cmd /c fciv.exe data.xml | find "."xml"" > data.xml.md5")