3 years ago, Language: English

Spark Read from SQL Server Source using Windows/Kerberos Authentication

In this article, I show how to use JDBC Kerberos authentication to connect to SQL Server sources in Spark (PySpark). I use a Kerberos connection that passes the principal name and password directly, which requires Microsoft JDBC Driver 6.2 or above. The sample code can run ...
Last modified by Administrator 2 years ago

Comments
Raymond, 2 years ago
#1464 Re: Spark Read from SQL Server Source using Windows/Kerberos Authentication
Sorry for the late reply. I've been busy recently, and I don't currently have an environment with Spark and AD integrated. I will update you once I have that configured.

venkatesan, 2 years ago
#461 Re: Spark Read from SQL Server Source using Windows/Kerberos Authentication

We are using Spark 2.x, but the keytab option was only added in Spark 3.x. Could you please share an article on connecting to SQL Server from a Kerberos-enabled Spark cluster using Kerberos authentication?


Raymond, 2 years ago
#460 Re: Spark Read from SQL Server Source using Windows/Kerberos Authentication

On a Kerberos-enabled Spark cluster, Kerberos is usually used to authenticate with other Hadoop services such as HDFS, Hive, and HBase. Access tokens are retrieved from those services for use in the Spark application. There might be a way to reuse that built-in mechanism for JDBC connections, but I am not familiar with the details.

To use Kerberos authentication with a keytab when reading data from SQL Server, you can pass the keytab and principal options to the JDBC data source:

  • keytab: Location of the Kerberos keytab file for the JDBC client. The file must be pre-uploaded to all nodes, either with the --files option of spark-submit or manually. When the value contains path information, Spark considers the keytab distributed manually; otherwise --files is assumed. If both keytab and principal are defined, Spark tries to do Kerberos authentication.
  • principal: Specifies the Kerberos principal name for the JDBC client. If both keytab and principal are defined, Spark tries to do Kerberos authentication.
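As a rough illustration of the two options above, here is a minimal PySpark sketch. It assumes a Spark version whose JDBC data source supports keytab and principal for SQL Server (the 3.1.1 documentation is the one referenced in this thread). The function names, host, database, table, keytab file name and principal are all hypothetical, and the integratedSecurity and authenticationScheme URL properties come from the Microsoft JDBC driver; whether your environment needs them as written is an assumption to verify.

```python
def build_keytab_jdbc_options(host, port, database, table, keytab, principal):
    """Build JDBC options for a keytab-based Kerberos login (hypothetical helper)."""
    # URL properties are Microsoft JDBC driver settings for Kerberos;
    # treat the exact combination as an assumption for your environment.
    url = (
        f"jdbc:sqlserver://{host}:{port};databaseName={database};"
        "integratedSecurity=true;authenticationScheme=JavaKerberos"
    )
    return {
        "url": url,
        "dbtable": table,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        # A bare file name means Spark assumes the keytab was shipped with
        # --files; a value with path information means it was copied manually.
        "keytab": keytab,
        "principal": principal,
    }


def read_with_keytab(spark, options):
    """Read a table through the JDBC data source using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()
```

For example, after submitting with `spark-submit --files /path/to/user.keytab app.py`, the script could call `build_keytab_jdbc_options("sqlhost.example.com", 1433, "mydb", "dbo.mytable", "user.keytab", "user@EXAMPLE.COM")` and pass the result to `read_with_keytab(spark, options)`.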

Please see this article for more details: JDBC To Other Databases - Spark 3.1.1 Documentation (apache.org).

If you still cannot work out a solution, I can find time to write a dedicated article on this.


venkatesan, 2 years ago
#459 Re: Spark Read from SQL Server Source using Windows/Kerberos Authentication

How will this work on a Kerberos-enabled Spark cluster?

Did you implement the ticket cache creation in Python? Please share it for reference.


Raymond, 2 years ago
#458 Re: Spark Read from SQL Server Source using Windows/Kerberos Authentication

Hello,

The complete code is already provided in the article: Spark Read from SQL Server Source using Windows/Kerberos Authentication

The example code uses the latest SQL Server JDBC driver, which doesn't require a keytab. Refer to the following article for how to generate a Kerberos ticket from a keytab (it also shows how to do that programmatically in Java):

Java Kerberos Authentication Configuration Sample & SQL Server Connection Practice
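
For readers who only see this thread, the keytab-free approach mentioned above might look roughly like the following PySpark sketch. This is not the article's code; it is a minimal illustration that assumes Microsoft JDBC Driver 6.2 or above, which accepts the principal name and password as connection properties when Kerberos is selected. The function names, host, database, table and principal are hypothetical.

```python
def build_password_jdbc_options(host, port, database, table, principal, password):
    """Build JDBC options for Kerberos with principal name and password (hypothetical helper)."""
    # Kerberos is selected through Microsoft JDBC driver URL properties.
    url = (
        f"jdbc:sqlserver://{host}:{port};databaseName={database};"
        "integratedSecurity=true;authenticationScheme=JavaKerberos"
    )
    return {
        "url": url,
        "dbtable": table,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        # The principal and password are handed to the driver as connection
        # properties, so no keytab or pre-created ticket cache is involved.
        "user": principal,
        "password": password,
    }


def read_with_password(spark, options):
    """Read a table through the JDBC data source using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()
```

In a real job, the password would normally come from a secret store or an environment variable rather than being written into the script.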




venkatesan, 2 years ago
#457 Re: Spark Read from SQL Server Source using Windows/Kerberos Authentication

Can you let me know how to set the keytab location in the script?

Can you share the complete code?

