Thursday, March 22, 2012

Pervasive Data Integrator 10 Review

What can I say about PDI? So far it has been a complete disappointment. Pervasive has certainly made a huge step forward in embracing a web-based solution in a RIA design using Adobe Flash and all the bells and whistles it has to offer. This comes at the expense of an application that has significant bugs in Chrome and Internet Explorer. When we received our training, the instructor not only faithfully stuck to using Firefox, but whenever we experienced significant bugs in the UI, his retort was simply that he was not experiencing any while using Firefox (hint, hint, use Firefox OK?).

I found that the product was shipped with incomplete, inaccurate or out-of-date documentation. For example, the documentation states that to execute shell commands from EZ Script you simply use the Shell() function (as documented), with one caveat: "If you run a .bat file, you cannot use the following syntax: Shell("file.bat"). Instead, use the following: Shell("cmd /c file.bat")". What the documentation completely leaves out is that even if you are not running a .bat file, you are still required to use "cmd /c" if the command uses special characters such as the "|" (pipe) to redirect stdout. That one cost me a few hours until I figured out on my own that, contrary to their documentation, "cmd /c" was probably needed in other situations as well. A follow-up from Pervasive's tech support confirmed this. This may seem minor; however, when you're a complete noob to PDI 10, the documentation and its accuracy are all you have to rely on. Small issues can quickly become time burners.
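
To make the "cmd /c" rule concrete, here is a minimal sketch in Python (not EZ Script, and not anything Pervasive ships). The helper names `wrap_for_shell` and `argv_seen_by_child` and the list of special characters are my own assumptions for illustration. The second function shows the underlying reason: when a program is launched without a command interpreter, the "|" is handed to it as a plain argument instead of being treated as a pipe.

```python
import subprocess
import sys

# Characters that only a command interpreter such as cmd.exe understands.
# This particular list is an assumption made for the example.
SHELL_METACHARS = set("|<>&")


def wrap_for_shell(command: str) -> str:
    """Prefix 'cmd /c ' when the command is a .bat file or uses shell syntax."""
    stripped = command.strip()
    if stripped.lower().startswith("cmd /c "):
        return stripped  # already wrapped, leave it alone
    first_token = stripped.split()[0].lower() if stripped else ""
    needs_interpreter = first_token.endswith(".bat") or any(
        ch in SHELL_METACHARS for ch in stripped
    )
    return f"cmd /c {stripped}" if needs_interpreter else stripped


def argv_seen_by_child(args):
    """Launch a child process WITHOUT a shell and report the arguments it got."""
    child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]
    result = subprocess.run(
        child + list(args), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(wrap_for_shell("file.bat"))
    print(wrap_for_shell("dir | sort"))
    # The pipe arrives as a literal argument because no shell interpreted it.
    print(argv_seen_by_child(["dir", "|", "sort"]))
```

The string returned by `wrap_for_shell` is what you would then pass to Shell() in EZ Script.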

Then we started building our projects and ran into some irritating bugs. For example, in the macro administration screen, when you define a macro whose value contains two or more single quotes, you will notice that saving the macro automatically deletes half of them. Click save again and it will do the same thing, until you are left with just one solitary single quote. Obviously the internal mechanics that save the macro value are not escaping single quotes, which are used in SQL to delimit text. Tech support told us it was because we were running the server as an administrator, which is obviously not the issue. We were not running it with an admin account.
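
The halving behaviour is what you would expect if the save path collapses doubled single quotes (the SQL escape for a literal quote) without escaping the value first. I have no visibility into PDI's internals, so the sketch below is only a guess at the mechanism, written in Python with hypothetical function names.

```python
def buggy_save(value: str) -> str:
    """Hypothetical save routine: collapses '' to ' as if the value
    had already been SQL-escaped, which it had not."""
    return value.replace("''", "'")


def sql_escape(value: str) -> str:
    """Double every single quote so it survives inside a SQL string literal."""
    return value.replace("'", "''")


def fixed_save(value: str) -> str:
    """Escape on the way in, unescape once on the way out: a clean round trip."""
    stored = sql_escape(value)
    return stored.replace("''", "'")


if __name__ == "__main__":
    value = "''''"  # four single quotes
    while True:
        saved = buggy_save(value)
        print(repr(value), "->", repr(saved))
        if saved == value:
            break
        value = saved
```

With the escape applied first, saving the same value any number of times returns it unchanged.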

Schemas are not properly processed. We have a schema in which an element's cardinality is set to 0 to unbounded. When that node has no data in the XML returned, it deserializes just fine using .Net WCF, but when PDI encounters the same XML it complains that it cannot find any instances of the element, which the schema defines as having a minOccurs of 0 - well duh! [UPDATE: This bug was fixed]
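
For reference, this is the behaviour I would expect from any consumer of such a schema: zero occurrences of an optional repeating element should simply produce an empty list. The sketch uses Python's standard XML parser, and the element names `Order` and `Item` are made up for the example, not taken from our schema.

```python
import xml.etree.ElementTree as ET


def read_items(xml_text: str, tag: str = "Item") -> list:
    """Return the text of every direct child named `tag`.

    An element declared with minOccurs="0" and maxOccurs="unbounded"
    may legitimately be absent, so an empty list is a valid result.
    """
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(tag)]


if __name__ == "__main__":
    print(read_items("<Order><Item>a</Item><Item>b</Item></Order>"))
    print(read_items("<Order/>"))  # no Item elements: empty list, not an error
```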

Then there are design flaws. Let's talk about job control. In the Integration Manager you can see how many jobs are queued. Unfortunately it won't tell you which jobs are queued. It will tell you there are jobs running but can't tell you accurately which jobs. Forget about being able to stop running jobs or remove them from the queue. Server Utilization will show 100% in one place while at the same time showing 65% in another. Manually run and then stop a job from the Integration Manager (if you are smart enough not to hose yourself by closing the browser or clicking away to another tab in the app [UPDATE: This bug was fixed]) and it will erroneously report that the job failed.

So imagine I'm in a situation where an exec wants to see a specific job run. I have to suspend all scheduled jobs, shut down the PDI server service and start it back up. Then I go to an Integration job and click the run button, only to notice that PDI is reporting that there are "3 Running" jobs right now. (a) I suspended all jobs and restarted the service, and (b) we only have 1 execution engine in our license, so how can there be "3 Running" jobs? The fact is I can't stop whatever these 3 jobs are and run the process the exec wants run right now. Suspending all processes and restarting the server service does nothing to stop these 3 jobs from running. Undeterred, I repeat the process. I shut down the service and start it back up again, only now it reports there are 4 jobs running. I still don't believe the dashboard. I run my process knowing that since there's only 1 exec engine license, it will run only if there aren't any other processes running. And behold, that is the case: my process ran because in fact there were no other jobs running. The report of "4 Running" jobs was erroneous. [UPDATE: This bug was fixed somewhere in the subsequent releases we received]

Once we got through all the gotchas and traps and finally had a working project, we found that we could only execute one job at a time. One would think that for that ridiculous price tag, our license would actually allow us to run more than one process at a time. No. Before agreeing to a price with Pervasive, *make absolutely sure* that whoever orders the product knows about this and confirms that the license allows for as many execution engines as you need. Pervasive sales failed to mention that very important fact to us.

That being said, we can now execute individual jobs one at a time, provided that the system does not hang and need to be restarted, which happens to us frequently.

Before we paid one penny for the product, our PM asked the salesperson if we could get a demo version that we could install and try out. It's always a good idea to test drive a product before dropping the dough. Pervasive never provided it, and it's obvious why: I would have failed it in the evaluation and we would not be where we are now. Unfortunately things did not turn out that way.

We're still in the process of turning this rotten turnip into something usable. Wish us luck. We're going to need it.

[UPDATE: After 3 months of waiting on memory leaks to get fixed, we sort of have a new PDI release, a version that is usable (albeit unstable) for what we need it to do. I would like to give special thanks to Jason from Pervasive for seemingly being the only person to take personal ownership of the memory leak issues I reported many moons ago. My only gripe is that 3 months is too long to wait for a bug fix when you're dead in the water, especially since they charged us $45,000 for this product. For the record, I was against the purchase of this product in the first place. There are free open source tools on the market that work much better than Pervasive DI, and they come with really good community support. And since they're open source, if you're a programmer like I am, you can just fix any glitches yourself. Unfortunately there are some people who think that the more something costs, the better it is, which is part of the reason why we ended up with this expensive hunk-of-junk. In other words, this is what happens when "the powers that be" are charmed by sales people and totally ignore the justified technical objections of their architect to such a deeply flawed product.]


  1. For the record, 8 months down the road, I moved on from the company that implemented PDI 10, and the system still has various minor issues and one major issue, which I reported to them at the beginning of the year. It seems that somewhere along the line, they closed my unanswered ticket for no apparent reason, so I had to prepare a very detailed PDF document reporting the issue and open a new ticket.

    Every week or so, the job execution engine hangs and no jobs will process. So basically, while I was still employed there, I had to baby-sit the system to make sure that it wasn't stuck and in need of fixing. Not really an ideal job for a systems architect who warned against using this product in the first place, but there you have it.

    In a phone call with Pervasive, I was told that the execution engine was so badly broken, they had to rewrite its code from scratch. The email reply I received from Pervasive on the matter toned this fact down quite a bit and simply stated that the current code "will no longer use this method to create engines". Here's the snippet from that email:

    " I had a few moments earlier to speak with an engineer about the main issue you are having of jobs no longer running after a variable amount of time (typically a few days), and he believes this could be due to a known issue in the underlying structure of how the engine is being called from the stack through the Integration Server SDK. The best news is that 10.2.5, when it’s released, will no longer use this method to create engines, so this issue should completely dissolve. We are also hopeful that the issue will not arise in the latest build of 10.2.4 that I will be sending you in the next couple of days. However, if the issue does arise, I would ask you to do the following to help us quickly identify the issue and get it patched in 10.2.4:

    1) Before clearing the queue or restarting the stack, navigate to your Pervasive installation directory and zip the “work” folder. It can be found at: “…..Pervasive\di-full-64bit-10.2.4-xx\data\di\execution\di9”. Please send me this folder zipped as it contains useful logs to determine where the hang occurred.
    2) To get you up and running: Clear the queue (if you have installed the new build). Then, you should only need to delete the java.exe for the engine that hung (should be smallest java.exe). As I know this can be difficult for you since you need to go through an IT department, you could elect to still have them restart the stack (they’ll still need to kill any remaining java.exe’s) or perform a machine reboot, but ideally, you should only need to kill the hung java.exe.

    I will let you know as soon as the new build becomes available and send you a download link. Please let me know any concerns, questions, issues that you have or that occur until then."

    AFAIK, the company is still waiting for that fix. As I said back in March, good luck people. You're going to need it :-)

    1. Read the final follow-up commentary here:

  2. Final update. Months after I moved on to another company, and almost a year after my previous employer got into the Pervasive Data Integrator mess I warned them to stay out of, they finally realized it was a huge money pit and scrapped the entire implementation. This was even after apparently paying Pervasive thousands more for onsite consulting to try to get the implementation up to par.

    If the executives had listened to the opinions of their resident expert (me) instead of trusting the obviously biased sweet talk of Pervasive's sales people, they would have saved a lot of time, money and hassle. I hope this review saves anyone else who is considering using PDI 10 from going down the same pointless and expensive rabbit hole my previous employer fell into.