SQL Train 2019

I’ve just returned home from PASS Summit 2019 and am taking a little bit of time to reflect on all I learned and to celebrate all of the great friends and colleagues that I connected with. Of course, my personal highlight was the SQL Train, on which 63 other SQL Server nerds and I traveled from Portland, OR to Seattle, WA.  This is an annual event for those traveling from SQL Saturday Oregon to Summit, and this was the fourth year in which I organized it.

Adrian Mee [B|T] was kind enough to share a couple of photos that he took of the event. Here we are boarding the two exclusive-occupancy train cars:

And here is a shot of how things looked once we were aboard:

No, I have no idea what had Neil Hambly [B|T] so freaked out either!

A pleasant surprise for this year was that even after all of the traditional SQL Train doughnuts, beer, and wine had been purchased, there was still money left over. Because of this I was very glad to be able to donate this surplus to Andy Mallon’s [B|T] #SQLFamily fundraiser for The Trevor Project on behalf of all those who rode this year’s SQL Train!

In the interest of full transparency I have included a screenshot of the financial details below:

Sadly, this will be the last SQL Train for at least two years. Next year’s PASS Summit will be held in Houston, TX, and the prospects for a train route from a nearby SQL Saturday don’t look good. I’m sure that we’ll figure out an appropriately fun replacement though!

Happy New Year!

For many of us in the Microsoft Data Platform community, PASS Summit is a bit like Christmas.  It’s a chance to get together with our SQL Family, make new friends, and start playing with all of the new toys that Microsoft has gifted to us.

By the same token, the first day back at work after Summit is akin to New Year’s Day.  We are hungover – often literally, but almost always metaphorically too – from the high of the previous week.  But we are also energized by the sense of community and shared excitement that we’ve just experienced.  It’s a great opportunity to turn that feeling into a new start!

So I am taking the opportunity to declare this day as the first day of my SQL year, and I’ve got a few New Year’s Resolutions to share:

1.  I resolve to spend at least one hour five days a week on physical exercise.  I spent all of this year’s Summit under the weather.  A sizable (pun intended) component of this is the weight that I’ve put back on in 2018 after losing fifty pounds in 2016 and 2017.  This has had a noticeable effect on my breathing, sleep, and general respiratory health, and it’s time to reverse the trend.

2.  I resolve to spend at least thirty minutes five days a week on professional development.  There’s so much cool technology to explore in the data platform space, and I find that as a consultant I am often too busy implementing techniques that I’ve done 1000 times before to learn new ones.  This will also include trying to produce more technical blog posts, as well as creating new content for presentations.  Speaking of which…

3.  I resolve to not submit any abstracts for new sessions or pre-cons until they are at least 75% developed.  I’ve caused myself a lot of unnecessary stress over the past couple of years by procrastinating about finishing my presentations.  I must do better on this front!

4.  I resolve to document the previous week’s activities on my blog.  Hopefully this will help me to keep myself accountable for sticking to the first three resolutions.

I invite you all to make your “New SQL Year’s Resolutions” with me, and look forward to toasting to our successes together at next year’s PASS Summit!

Introduction To sp_execute_external_script With Python, Part 1

I’ve been speaking on integrating Python and SQL Server for a while now, and having given the same presentation at nearly a dozen SQL Saturdays, PASS Summit, and multiple user groups, it’s about time to retire my favourite session.  Just in case there’s anyone out there who is interested in the subject but hasn’t managed to see it yet, I thought I’d start sharing some of the code and use-cases in a series of blog posts.

We’ll start off with a simple step-by-step introduction to the sp_execute_external_script stored procedure.  This is the glue that enables us to integrate SQL Server with the Python engine by sending the output of a T-SQL query over to Python and getting a result set back.  For example, we could develop a stored procedure to be used as a data set in an SSRS report that returns statistical data produced by a Python library such as SciPy.  In this post we’ll introduce the procedure and how to pass a simple data set into and back out of Python, and we’ll get into manipulating that data in a future post.

If you’d like to follow along with me, you’ll need to make sure that you’re running SQL Server 2017 or later, and that Machine Learning Services has been installed with the Python option checked.  Don’t worry if you installed both R and Python, as they play quite nicely together.  The SQL Server Launchpad service should be running, and you should make sure that the ability to execute the sp_execute_external_script procedure is enabled by running the following code:

--  Introduction To sp_execute_external_script With Python - Query 1.00
--  Enabling sp_execute_external_script.

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;
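
To confirm that the change took effect, we can query sys.configurations; the value_in_use column should now show 1.  This is just an optional sanity check, so feel free to skip it:

--  Introduction To sp_execute_external_script With Python - Query 1.00b
--  (Optional) Confirming that external scripts are enabled.

SELECT
    name,
    value,
    value_in_use
FROM
    sys.configurations
WHERE
    name = 'external scripts enabled';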

As with any introduction to a new language, we’re contractually obligated to kick things off with a “Hello, world!”-style script.  You’ll note that our first procedure call below has only two parameters.  The @language parameter says that we’ll be using Python code as opposed to R, and that the Launchpad service should start instantiating the Python engine.  The @script parameter then contains the Python code that we’ll be executing, which in this case is a simple print statement.  Note that when you run this, the printed text comes back in the Messages tab as an STDOUT message rather than as a result set:

--  Introduction To sp_execute_external_script With Python - Query 1.01
--  Calling the procedure with no input or output.

EXEC sp_execute_external_script
    @language       = N'Python',
    @script         = N'print("Welcome to the wonderful world of Python integration!")';

Now that we know our Python integration is working, the next step is to send some data over from SQL Server and get it back out.  We specify a T-SQL query to run in the @input_data_1 parameter; the SQL engine will execute this query and send the resulting data set over to Python.  The @script parameter has now been updated to simply say that we will take that data, identified by its default name of InputDataSet, copy it over to OutputDataSet (another default) and then return it to SQL Server.  The WITH RESULT SETS clause describes the shape of the data coming back, letting us give the output column a name and a data type.  I’ve got a tally table (a table of sequential numbers) set up in my Admin database, and I’ll begin by just sending across the first ten numbers from that table:

--  Introduction To sp_execute_external_script With Python - Query 1.02
--  Passing the top ten numbers from our tally table into Python
--  and getting them right back out.

USE Admin;
GO

EXEC sp_execute_external_script
    @language       = N'Python',
    @script         = N'OutputDataSet = InputDataSet',
    @input_data_1   = N'SELECT TOP 10 * FROM dbo.Tally;'
WITH RESULT SETS
((
    NewColumnName   INT NOT NULL
));
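
Under the covers, the data set arrives in Python as a pandas DataFrame.  If you’d like to see that for yourself, here’s a quick sketch (assuming the same dbo.Tally table) that prints the type of InputDataSet from within the script; the printed type appears in the Messages tab alongside the result set:

--  Introduction To sp_execute_external_script With Python - Query 1.02a
--  Peeking at the type of the input data set.

EXEC sp_execute_external_script
    @language       = N'Python',
    @script         = N'
print(type(InputDataSet))    # <class ''pandas.core.frame.DataFrame''>
OutputDataSet = InputDataSet
',
    @input_data_1   = N'SELECT TOP 10 * FROM dbo.Tally;'
WITH RESULT SETS
((
    NewColumnName   INT NOT NULL
));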

If for some reason you don’t have a tally table and aren’t able to add one to your system, you can use the ROW_NUMBER() function against the sys.objects catalog view instead:

--  Introduction To sp_execute_external_script With Python - Query 1.02b
--  Using sys.objects instead of a tally table.

EXEC sp_execute_external_script
    @language       = N'Python',
    @script         = N'OutputDataSet = InputDataSet',
    @input_data_1   = N'SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY name) FROM sys.objects;'
WITH RESULT SETS
((
    NewColumnName   INT NOT NULL
));

There’s no need to keep the default name of “InputDataSet” for our input data set.  I’m quite partial to “myQuery” myself.  Renaming it is as simple as adding another parameter called @input_data_1_name.  Of course, we have to remember to update the Python code in @script to reflect the new name:

--  Introduction To sp_execute_external_script With Python - Query 1.03
--  Using a custom name for the input data.

USE Admin;
GO

EXEC sp_execute_external_script
    @language           = N'Python',
    @script             = N'OutputDataSet = myQuery',
    @input_data_1       = N'SELECT TOP 10 * FROM dbo.Tally;',
    @input_data_1_name  = N'myQuery'
WITH RESULT SETS
((
    NewColumnName    INT NOT NULL
));

Unsurprisingly, we can also change the name of our output data set by adding the @output_data_1_name parameter and updating @script accordingly:

--  Introduction To sp_execute_external_script With Python - Query 1.04
--  Using custom names for both the input and output data.

USE Admin;
GO

EXEC sp_execute_external_script
    @language           = N'Python',
    @script             = N'Results = myQuery',
    @input_data_1       = N'SELECT TOP 10 * FROM dbo.Tally;',
    @input_data_1_name  = N'myQuery',
    @output_data_1_name = N'Results'
WITH RESULT SETS
((
    NewColumnName    INT NOT NULL
));

I should pause here to point out that although having parameters named @input_data_1, @input_data_1_name, and @output_data_1_name may imply that we can manage multiple input and output data sets with sp_execute_external_script, that is sadly not the case.  I’m hoping that this naming convention was chosen to allow room for that to happen in a future release (hint, hint, product team!).

Although this input query is very simple, in the real world we might be sending over the output of some very complex queries.  The input query in one of the use cases in the session I’ve been presenting is over 150 lines!  To make this more manageable in our integration, rather than embedding the whole query in our procedure call and making an unholy mess, we’ll instead store it in a variable called @InputQuery and pass that in to @input_data_1.  This allows us to keep our standard T-SQL coding style, and has the added benefit of being easy to copy and paste into a new query window for development or, heaven forbid, debugging.

--  Introduction To sp_execute_external_script With Python - Query 1.05
--  Variablizing the input data query.

USE Admin;
GO

DECLARE @InputQuery     NVARCHAR(500);
SET @InputQuery = N'
SELECT TOP 10
    SequenceNumber
FROM
    dbo.Tally;
';

EXEC sp_execute_external_script
    @language           = N'Python',
    @script             = N'Results = myQuery',
    @input_data_1       = @InputQuery,
    @input_data_1_name  = N'myQuery',
    @output_data_1_name = N'Results'
WITH RESULT SETS
((
    NewColumnName    INT NOT NULL
));
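
Speaking of debugging: because @InputQuery is now a complete, self-contained batch, we can run it on its own to verify that it returns what we expect before handing it over to Python.  Here’s a minimal sketch of that sanity check, again assuming the dbo.Tally table:

--  Introduction To sp_execute_external_script With Python - Query 1.05b
--  (Optional) Sanity-checking the input query on its own.

USE Admin;
GO

DECLARE @InputQuery     NVARCHAR(500);
SET @InputQuery = N'
SELECT TOP 10
    SequenceNumber
FROM
    dbo.Tally;
';

--  Execute just the input query, with no Python involved.
EXEC sp_executesql @InputQuery;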

(Yes, “variablizing” is a real word.  I checked.)

We can do the same thing with our Python code, which in this example we’ve stored in a variable named @PyScript and then passed in to the @script parameter:

--  Introduction To sp_execute_external_script With Python - Query 1.06
--  Variablizing both the input data query and the Python script.

USE Admin;
GO

DECLARE
    @InputQuery     NVARCHAR(500),
    @PyScript       NVARCHAR(500);

SET @InputQuery = N'
SELECT TOP 10
    SequenceNumber
FROM
    Admin.dbo.Tally;
';

SET @PyScript = N'
Results = myQuery
';

EXEC sp_execute_external_script
    @language           = N'Python',
    @script             = @PyScript,
    @input_data_1       = @InputQuery,
    @input_data_1_name  = N'myQuery',
    @output_data_1_name = N'Results'
WITH RESULT SETS
((
    NewColumnName    INT NOT NULL
));

Look at that nice clean procedure call!  Marvel at how the business logic in our T-SQL query and our Python code is separated out, formatted nicely, and ready to copy into another window or IDE!  Can we all now agree to use this format for our sp_execute_external_script calls going forward?

That concludes part one, but we’ll return with part two very soon, in which we’ll start to manipulate our data set in Python and see what happens when things don’t go so well…

Kentuckiana Mini-Tour

Having just finished leading the efforts on the very successful SQL Saturday Albuquerque, by all rights I should be taking this week off to recuperate.  Instead I’m starting a whirlwind “mini-tour” of Kentuckiana tomorrow, with three stops at which I’ll be presenting on SQL Server’s Machine Learning Services components.

Tuesday night I’ll be at Blue Stallion Brewing in Lexington, KY talking to the Lexington PASS group about how to leverage Python for statistical analysis in SQL Server 2017.  I’m looking forward to the unique experience of presenting with a pint glass in my hand rather than a bottle of water.  As long as I don’t try to use the pint glass like a laser pointer, I’ll be fine!  Register Here.

I don’t have any speaking engagements on Wednesday night, but will instead be heading up to Columbus, OH with my friend Matt Gordon [b|t] (who founded the Lexington PASS group) and his son to watch the Columbus Crew take on the Philadelphia Union.  As many of you know, I’m a devout soccer (real football) fan, and I’m really looking forward to visiting one of the most storied environments in US soccer.

Thursday is a double-header, starting in Evansville, IN at 11:00 AM and finishing in Louisville, KY at 6:00 PM.  With a two-hour drive between them, not to mention the one-hour time difference, this will be an intense day.  It’s a good thing then that the session I’ll be presenting is the same for both groups!  Like the Lexington session I’ll be talking about statistical analysis, but this time I’ll be focusing on the R programming language, so the talk will be directly applicable to SQL 2016 as well as 2017.  Register Here for Evansville, IN.  Register Here for Louisville, KY.

If you live close to any of these three presentations, I’d love for you to come out so I can meet you and talk data and statistics with you!

I’m Speaking – December 2017

After a relatively light speaking schedule in November – just PASS Summit this year – I’ll be back with a vengeance in December.  I’ve got three sessions lined up, all at venues where I’ll be presenting for the first time!

I’ll be kicking the month off bright and early on Friday, December 1st with the winter edition of the GroupBy conference.  This is a virtual conference held over the first two Fridays in December, with ten sessions voted in by the community.  I was lucky enough to have my SQL Server R Services session selected this time around, and I hope to “see” you all there at 10:30 am EST (17:30 UTC).

I’ll be following that up with a trip to the nation’s capital the following week to present the same session at SQL Saturday Washington DC on Saturday, December 9th.  I was able to attend last year to see some great presentations, but this year the organizers have made the mistake of asking me to speak too!  I’m looking forward to following the excellent Kevin Feasel [b|t] once again, as our sessions complement each other quite well.

For my final speaking slot of the year, I’ll be presenting virtually to the Orange County BIG PASS group headed up by my friend Mickey Stuewe [b|t].  I’ll be dusting off an older presentation on implementing row-based security through a single custom solution that works for both SSRS and SSAS.  If you’ll be in the Lake Forest area on Thursday, December 14th, please register on their Meetup group and join me for this one.

What a great way to close out another busy year!

T-SQL Tuesday #96: Folks Who Have Made A Difference

This month’s T-SQL Tuesday is hosted by Ewald Cress [b|t], who has invited people to give public shout-outs to those who have helped them in their career or made a meaningful contribution to their “life in the world of data.”

I would like to express my gratitude to two people in particular:  Steve Jones [b|t] and Grant Fritchey [b|t].  The guidance and encouragement that these two have given me have significantly boosted my presentation skills and confidence, and in turn transformed my career as a data professional.  Let me tell you a bit more.

Steve has been very supportive and encouraging since I began speaking at SQL Saturday events.  I remember one presentation in Denver in particular where I wasn’t happy with how it had gone, but Steve was kind enough to say how much he liked the way I’d handled the demo portion, and even recommended to a veteran speaker that he could do worse than follow my example.  You can imagine what a boost to my speaking confidence that was!  He was also the first person to really push me to submit my R session to Summit last year, when I didn’t think I was quite ready to step up to that level.  Thank you for your backing and inspiration, Steve!

Grant attended a presentation of mine in Albany, and I must admit I was a bit unnerved to have one of the “big names” of SQL in my session.  Afterwards he provided me with a comprehensive set of pointers covering both the things I did well and those where I needed improvement.  This list was instrumental in improving my speaking skills, and is the kind of valuable feedback that is rare to receive.  The really amazing thing to me was that he prepared this list for me while on vacation with his family!  Thank you for your generosity and guidance, Grant!

To Steve and Grant, I can’t thank you enough for the impact you’ve had on my career.  Not only have you helped me grow, you’ve also shown me a model for how I can help others in the SQL community grow too.  Now hopefully I can do the same for someone else on their journey.

SQL Train 2017

One of my highlights of PASS Summit season was the SQL Train, which ferried me and 35 fellow SQL Server nerds from Portland, OR to Seattle, WA.  This is an annual event for those traveling from SQL Saturday Oregon to PASS Summit, and this was the second year for which I was one of the organizers.

In 2015, the train had 20 loud and unruly SQL passengers, much to the chagrin and annoyance of the other 16 people stuck with us in the car!  Last year Mickey Stuewe [B|T] and I decided to organize the event more formally to see if we could get 36 people on board – enough to fill our own, dedicated train car – and it was a great success.

This year, Mickey stepped back from the organizing duties, leaving it all up to me!  Things ran very smoothly this time around.  Almost all of the passengers paid up on time, and everyone showed up at the correct train station right when they were supposed to.  The only minor wrinkle was that a few folks had to drop out in the week before the trip, but we were able to fill their seats quickly with people from the wait list.

Picture courtesy Slava Murygin [B|T]

This year the train had SQL passengers from all over the globe: Australia, Canada, Iceland, the Netherlands, New Zealand, Norway, Portugal, South Africa, and from at least ten states within the U.S.!

Of course when this many data professionals get together, especially after a full day of learning and presenting at SQL Saturday Oregon, we sometimes have to let off a little steam.  In fact, I’m not sure I’ve ever seen Peter Kral [T] quite as happy as when he produced this bottle of Irish cream:

Picture courtesy Cathrine Wilhelmsen [B|T]

We also did something a little different this year and added a slight premium to the ticket price, which enabled us to purchase some snacks and beverages for the car.  Many thanks to Bob, Dan, Heidi, Jared, Peter, and Rob for their help in acquiring these items and carrying them onto the train!  Here’s how that money was spent:

I’m already looking forward to next year’s SQL Train, which will be on November 4th, 2018.  With any luck we will get enough interest to fill two full train cars this time!