Raymond Raymond

Connecting Apache Zeppelin to your SQL Server

event 2018-04-22 visibility 4,122 comment 7 insights toc
more_vert
insights Stats

This page demonstrates the steps you need to connect to SQL Server in Zeppelin. There are many ways to implement this, for example SQL Server interpreters in GitHub. In this page, I am going to use the JDBC driver to connect to SQL Server instead of using third party interpreters.

For authentication, I am using database login (SQL Server authentication) instead of Windows login since my machine is not part of any domain. If you want to setup Kerberos authentication, please refer the following link about how to configure ktab file name and SPN:

Java Kerberos Authentication Configuration Sample & SQL Server Connection Practice

Prerequisites

If you have not installed Zeppelin, follow the link below to setup your local instance:

Install Zeppelin 0.7.3 in Windows

Java JDK is also required, which is included as part of the above installation guide.

Connect to SQL Server

Setup SQL Server account

In my machine, I’ve setup one SQL Server login in my local SQL Server instance:

Login: zeppelin

Password: zeppelin

The above credential is going to be used when setting up interpreter.

Setup interpreter

In Zeppelin website, click Interpreter menu.

image

Click create button.

Input the following:

  • Interpreter Name: tsql
  • Interpreter group: jdbc

image

Setup the following properties:

  • default.driver: com.microsoft.sqlserver.jdbc.SQLServerDriver
  • default.password: zeppelin
  • default.url: jdbc:sqlserver://localhost
  • default.user : zeppelin
  • zeppelin.jdbc.auth.type: SIMPLE
  • artifact: com.microsoft.sqlserver:mssql-jdbc:6.5.1.jre8-preview

image
You can change the property values to your own environment setup.

For the artifact setting, you can use local lib file path or using groupid:artifact id:version. In my case, I am using the following SQL Server JDBC library.

https://search.maven.org/#artifactdetails%7Ccom.microsoft.sqlserver%7Cmssql-jdbc%7C6.5.1.jre8-preview%7Cjar

Click Save button to save this new interpreter.

Create note

Now you can create a new note with the new interpreter (%tsql).

The following is a sample code I use:

%tsql

select * from sys.all_objects where type='U'

The screenshot below shows the result:

image

Summary

It’s very easy to use JDBC to connect to SQL Server or any other JDBC compatible databases, for example, Oracle and Teradata.

Once you can connect to the database, you can then do analytics easily in Zeppelin.

More from Kontext
comment Comments
Raymond Raymond #217 access_time 5 years ago more_vert

Hi,

You got that error because Zeppelin cannot find the SQL Server JDBC driver.

Have you setup the dependencies for the interpreter as shown in the screenshot above? 

artifact: com.microsoft.sqlserver:mssql-jdbc:6.5.1.jre8-preview

Make sure Zeppelin install this artifact from internet successfully. 

format_quote

person Son Nguyen access_time 5 years ago

After running the notebook, I encountered this error:
java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.apache.zeppelin.jdbc.JDBCInterpreter.createConnectionPool(JDBCInterpreter.java:412) at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnectionFromPool(JDBCInterpreter.java:423) at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:486) at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:692) at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:820) at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103) at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632) at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162) at jaat java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)
  
Please help me fix it!
hide_source Anonymous #64 access_time 5 years ago more_vert
After running the notebook, I encountered this error:
java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.apache.zeppelin.jdbc.JDBCInterpreter.createConnectionPool(JDBCInterpreter.java:412) at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnectionFromPool(JDBCInterpreter.java:423) at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:486) at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:692) at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:820) at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103) at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632) at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162) at jaat java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)
  
Please help me fix it!
hide_source Anonymous #63 access_time 5 years ago more_vert

I am also running v0.7.3.

I looked into that workaround. When I tried to implement it, I got other errors such as tool.jar and other basic java errors. It started feeling like I was chasing a rabbit so before I spent more time with the workaround I've been going back and retracing my steps making sure that I didn't forget something. So far I haven't had any luck finding a solution and may turn back to that work around to see if I can chase that rabbit some more to see if that solves it.

Raymond Raymond #216 access_time 5 years ago more_vert

I could not try Kerberos locally in my computer as it is not in a domain but I definitely tried in the past in one project. 

Your issue happened because Zeppelin's VM cannot find the package that contains that required class.

It seems to be the same as the following one:

https://issues.apache.org/jira/browse/ZEPPELIN-1326

What's your version of Zeppelin? Mine is Version 0.7.3.

Have you tried the workaround in that post to add a dependency in the interpreter settings or upgrade your Zeppelin?

When I changed my interpreter's URL to use Kerberos authentication ( jdbc:sqlserver://localhost;integratedSecurity=true;authenticationScheme=JavaKerberos;), I will get the following error as I didn't (and also couldn't) configure Kerberos as my other post suggests:

Caused by: GSSException: Invalid name provided (Mechanism level: KrbException: Cannot locate default realm)

format_quote

person Michael Black access_time 5 years ago

Thanks for the links. Side note, I haven't found too many others who have used Data Stage either. If you're ever bored and want to try making Zeppelin use Windows Auth, maybe you'll have better luck than me. In theory I agree with you that it should be possible but so far I haven't been lucky. I suspect it's something more on the Zeppelin side.

java.lang.NoClassDefFoundError: org/apache/hadoop/security/UserGroupInformation$AuthenticationMethod
at org.apache.zeppelin.jdbc.security.JDBCSecurityImpl.getAuthtype(JDBCSecurityImpl.java:65)
at org.apache.zeppelin.jdbc.security.JDBCSecurityImpl.createSecureConfiguration(JDBCSecurityImpl.java:42)
at org.apache.zeppelin.jdbc.JDBCInterpreter.open(JDBCInterpreter.java:190)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
hide_source Anonymous #62 access_time 5 years ago more_vert

Thanks for the links. Side note, I haven't found too many others who have used Data Stage either. If you're ever bored and want to try making Zeppelin use Windows Auth, maybe you'll have better luck than me. In theory I agree with you that it should be possible but so far I haven't been lucky. I suspect it's something more on the Zeppelin side.

java.lang.NoClassDefFoundError: org/apache/hadoop/security/UserGroupInformation$AuthenticationMethod
at org.apache.zeppelin.jdbc.security.JDBCSecurityImpl.getAuthtype(JDBCSecurityImpl.java:65)
at org.apache.zeppelin.jdbc.security.JDBCSecurityImpl.createSecureConfiguration(JDBCSecurityImpl.java:42)
at org.apache.zeppelin.jdbc.JDBCInterpreter.open(JDBCInterpreter.java:190)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Raymond Raymond #215 access_time 5 years ago more_vert
I have not tried but it is definitely doable.
You need to configure Kerberos first. 
Please refer to the following page for more details about how to do Kerberos authentication:
After configuration, you can change the authentication schema to JavaKerberos:
jdbc:sqlserver://servername=server_name;integratedSecurity=true;authenticationScheme=JavaKerberos;

If your SQL Server and Zeppelin are installed in stand-alone machines, you can use NTLM authentication (instead of Kerberos). For more details, please refer to this post:
format_quote

person Mike access_time 5 years ago

Have you been able to connect using Kerberos as the authentication method instead of local sql authentication?

hide_source Anonymous #61 access_time 5 years ago more_vert

Have you been able to connect using Kerberos as the authentication method instead of local sql authentication?

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts