“ERROR: … failed to process… ” in NiFi

I was greeted by a few cryptic things in NiFi this morning during my morning check-in.

A PutSQL processor was reporting an error:

"ERROR: PutSQL[id=$UUID>]failed to process due to java.lang.IndexOutOfBoundsException: Index: 1, Size: 1; rolling back session: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1"

There were no recent errors counted in the LogAttribute counter we set up to record errors;
The Tasks/Time count in the PutSQL processor was though the roof, despite the errors and lack of successes.

Needless to say, the processor was all bound up and a number of tasks were queued. Not a good start to my day.

I checked the data provenance and didn’t see anything remarkable about the backed-up data. The error message suggests (to me) that the first statement parameter is at fault, and that parameter happened to be a date (which has been problematic for me in NiFi with a MySQL backend). Neither that value, nor the rest of the values, were remarkable or illegal for the fields they’re going into.

It wasn’t until I spent some time looking over the source data that I saw the problem: there is a duplicate key in the data. This error is NiFi’s way of complaining about it.

In our case the underlying table doesn’t have good keys, or a good structure in general, and I’m planning to replace it soon anyway, but updating the primary keys to allow the duplicate data (because it IS valid data, despite the table design) has solved the issue.

Author: H Walker Jones, Esq

A professional programmer with a sordid past involving sysadmin, tech support, and cooking. View all posts by H Walker Jones, Esq

4 thoughts on ““ERROR: … failed to process… ” in NiFi”

Nick says:
July 22, 2016 at 12:06 pm
Hello there,
What did you eventually do with the above error. I am having the same error and have been scratching my head to figure out what’s wrong..It is also random at times….
Any insights?
Many thanks!
Loading...
Reply
1. Dad says:
  July 22, 2016 at 12:17 pm
  In my case it was a duplicate key in the database, but it’s possible that other database errors would give the same error. I had the opportunity to change the database keys rather than changing the data itself, but my situation was special.
  My next step, if I hadn’t seen the problem yet, would have been writing a test harness to loop over the original raw data and try inserting it into a copy of the production database to see if any errors crop up.
  The ConvertJSONToSQL and PutSQL processors aren’t very helpful with their errors messages or the logging they provide. PutSQL in particular will show you the parameterized statement it ran, but not the values for the parameters which makes it difficult to figure out which value(s) might be causing a problem on long runs.
  Loading...
  Reply
  1. Raghav Ramakrishnan says:
    August 24, 2016 at 3:14 pm
    I was facing the same issue here. I am pulling data from RSS feeds and I guess the EvaluateXquery is producing duplicate flowfiles and the mergeContent is merging them. Basically, the attributes are being duplicated like I am getting multiple titles, descriptions and pubDates of the same item. I tried having the title as my primary key in the MySQL db to avoid getting in the duplicate data, but the error you mentioned above popped up. If i change the primary key, i am able to get the duplicate data in the table. I hope there was an INSERT IGNORE or a REPLACE DDL command support in the PutSQL processor and it woudl solve this problem. Because, if i store the merged file in my disk using PutFile, I have the option of REPLACE which does the trick. Do you know if there is anything I can do to not create these duplicate data inside the dataflow itself or do I do something different in the DB?
    Loading...
    Reply
    1. Dad says:
      August 27, 2016 at 6:34 pm
      Do you know if there is anything I can do to not create these duplicate data inside the dataflow itself or do I do something different in the DB?
      It sounds like your database structure isn’t a good match to the incoming data.
      If you can’t restructure the database or redefine the data, you could create a custom processor (or more likely, a script that executes with the ExecuteStreamCommand processor) to check the data against the database and reroute it with the routing processor as needed.
      Loading...
      Reply

“ERROR: … failed to process… ” in NiFi

Like this:

Related

Author: H Walker Jones, Esq

4 thoughts on ““ERROR: … failed to process… ” in NiFi”

Leave a ReplyCancel reply