HDInsight error with external metastore (Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient)

Recently I was working on setting up a Big Data environment in Azure.
From Azure Data Factory I was spinning up an on-demand HDInsight cluster with an external metastore.

Unfortunately, I kept getting the following error: “Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient”.
After I contacted Microsoft support about it, they found that the error was caused by a known Hive bug:
https://issues.apache.org/jira/browse/HIVE-12536

In short, the error was caused by having dashes (-) in the name of the metastore database. After I removed the dashes, the problem disappeared and I was able to create the on-demand HDInsight cluster.
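
If you want to catch this before deploying, a check along the following lines might help. It is only a minimal sketch: the function names are hypothetical, and the dash is the only character this post confirms as problematic, so nothing else is checked.

```python
def has_problem_dash(database_name: str) -> bool:
    """Return True if the metastore database name contains a dash."""
    # Assumption: only "-" is checked, because it is the only character
    # this post identifies as triggering the error.
    return "-" in database_name


def without_dashes(database_name: str) -> str:
    """Return the name with every dash removed, mirroring the fix described above."""
    return database_name.replace("-", "")


if __name__ == "__main__":
    name = "db-metastore-p"  # the metastore name from this post
    if has_problem_dash(name):
        print(f"{name!r} contains dashes; a name like {without_dashes(name)!r} avoids them")
```

Removing the dashes outright matches what I did; replacing them with another character is a separate choice that this post does not cover.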

Here is an excerpt of the error log; the name of my metastore database was db-metastore-p:


Logging initialized using configuration in file:/C:/apps/dist/hive-0.14.0.2.2.9.1-1/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/apps/dist/hadoop-2.6.0.2.2.9.1-1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/apps/dist/hbase-0.98.4.2.2.9.1-1-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:445)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:619)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:63)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:73)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2743)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2762)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:426)
... 8 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1481)
... 13 more
Caused by: javax.jdo.JDOUserException: Could not create "increment"/"table" value-generation container db-metastore-p.dbo.SEQUENCE_TABLE since autoCreate flags do not allow it.
NestedThrowables:
org.datanucleus.exceptions.NucleusUserException: Could not create "increment"/"table" value-generation container db-metastore-p.dbo.SEQUENCE_TABLE since autoCreate flags do not allow it.
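
The useful part of the log is the last “Caused by” entry, which names the container as database.schema.table. As a rough sketch, assuming the message keeps the wording shown in the excerpt above, the database name could be pulled out of a log like this (the function name is hypothetical):

```python
import re
from typing import Optional

# Assumption: the message wording matches the excerpt in this post, i.e.
# 'value-generation container <database>.<schema>.<table> since ...'.
_CONTAINER = re.compile(r"value-generation container (\S+) since")


def metastore_database_from_log(log_text: str) -> Optional[str]:
    """Return the database part of the container named in the log, or None."""
    match = _CONTAINER.search(log_text)
    if match is None:
        return None
    # Split from the right so that only the schema and table are cut off.
    parts = match.group(1).rsplit(".", 2)
    if len(parts) != 3:
        return None
    return parts[0]


if __name__ == "__main__":
    line = (
        'Caused by: javax.jdo.JDOUserException: Could not create "increment"/"table" '
        "value-generation container db-metastore-p.dbo.SEQUENCE_TABLE "
        "since autoCreate flags do not allow it."
    )
    print(metastore_database_from_log(line))
```

If the extracted name contains a dash, the bug linked above is a likely suspect.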

5 thoughts on “HDInsight error with external metastore (Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient)”

  1. Hi Olandese,

    I’m facing the same problem. I’m using an ARM template to deploy my data factory, following a Microsoft tutorial (if you google ‘data factory build your first pipeline using ARM’ you will find it). My problem is that I can’t see any way to specify the metastore database name in the JSON parameters for the ARM template. How are you changing the metastore database name?

    Thanks!

    James

      1. Great, thanks for your help Olandese! Unfortunately this hasn’t worked for me, as I get the error ‘HCatalog integration is not enabled for this subscription.’ if I try to use that property.

        I have created an Azure SQL database and added a SQL database linked service to my data factory definition:

        {
          "dependsOn": [ "[concat('Microsoft.DataFactory/dataFactories/', variables('dataFactoryName'))]" ],
          "name": "AzureSqlHiveMetastoreLinkedService",
          "type": "linkedservices",
          "apiVersion": "[variables('apiVersion')]",
          "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {
              "connectionString": ""
            }
          }
        }

        Then added the hcatalogLinkedServiceName property pointing to that linked service – but I get the following error when deploying the ARM template:

        ‘GpPrescribingDataFactory/HDInsightOnDemandLinkedService’ failed with message ‘HCatalog integration is not enabled for this subscription.’

        Do you have any further suggestions? As far as I’m aware my subscription doesn’t have any limitations.

      2. You are right, I ran into the same problem on my private subscription (it worked on the enterprise subscription of the company I work for). I posted a question to Microsoft in the Azure Advisors network on Yammer and got this answer: “we used to have hcat and schemageneration enabled prior to GA. For subs that were using it during public preview we automatically whitelisted them. We are working on re-enabling this feature soon”

  2. Aargh!

    Okay thanks for the information, you’ve been extremely helpful 🙂

    I’m trying to sign up to Azure Advisors now. It’s very frustrating because, as far as I can see, using Hive with on-demand HDInsight is basically impossible right now!
