The Old Reader

19 Apr 00:15

Your #1 Job….

by Karen Lopez

I hear frequently, especially from the DBA groups, that our number one job as a data professional is performance. That typically includes making sure database queries run fast, that systems have expected uptimes, and that developers/DBAs can do their jobs as fast as possible without slowing down to consider whether or not they are doing the right thing for the data. In fact, I’ve been told many times that data quality is Job NULL, meaning that we shouldn’t care as much about data quality as we do about performance. The crazy things I’ve read: query running slow? Delete some rows and see if anyone notices. Assign numeric datatypes to number-like columns so they will be smaller (and missing leading zeros). Make columns small, even if it means losing data. Shove data in a column with comma delimiters so that you don’t have to change the database. Re-use a column for something it was never intended for.

Developers and DBAs start thinking this way, for the most part, because they are measured and rewarded based on all kinds of factors other than data quality. And yet management expects systems to support exactly what Tim Berners-Lee says in this quote. Sure, making systems purr is one part of allowing data to be connected across sources. But misleading data, mis-understoood data and plain old bad data means that CEOs can’t run a company effectively.

Any enterprise CEO really ought to be able to ask a question that involves connecting data across the organization, be able to run a company effectively, and especially to be able to respond to unexpected events.

Most organizations are missing this ability to connect all the data together.

There are all kinds of presentations and blog posts about how to make systems run fast. There are so few about how to love your data so that the CEO can rely on it. The first person that needs to fix this mismatch of incentives and actions is the CEO. She needs to ensure that IT professionals are properly evaluated and motivated to produce both fast data and correct data. And to stop providing incentives for IT professionals to work against data quality.

DBAs and Developers want to do the right thing. It’s just that we are paying them to do the wrong things over the right things.

14 Apr 18:33

Three Wishes from SQLPASS.org in 2015

by KKline

I'm not above giving the PASS board of directors a piece o'my mind. So here are a few suggestions that I think could further improve the organization and offer a lot of value back to the community....(read more)

14 Apr 18:32

SQL Server and 20 Cores Limit

by tlachev

Scenario: You execute a SQL Server 2012 task that uses parallelism, such as index rebuild or a query on a server with more than 20 cores running SQL Server 2012 Enterprise Edition. In the Windows Task Manager, you observe that the task uses only 20 cores. We discovered this scenario during a rebuild of a columnstore index. To confirm this further, you examine the SQL Server log and notice that a similar message is logged when the SQL Server instance starts:

“SQL Server detected 8 sockets with 4 cores per socket and 4 logical processors per socket, 32 total logical processors; using 20 logical processors based on SQL Server licensing. This is an informational message; no user action is required.”

Explanation: More than likely, you have upgraded to SQL Server 2012 from SQL Server 2008 R2 under Software Assurance. Microsoft created a special SKU of Enterprise Edition to support this scenario with the caveat that this SKU limits an instance to using only 20 processor cores (or 40 CPU threads if hyperthreading is enabled). If this level of parallelism is not enough, the only solution is to switch to the Enterprise Edition SKU that is licensed per core and purchase a license that covers as many cores as needed. Once you obtain the new license key, you can upgrade your SQL Server instance:

Setup.exe /q /ACTION=editionupgrade /INSTANCENAME=MSSQLSERVER /PID=<PID key for new edition>" /IACCEPTSQLSERVERLICENSETERMS

14 Apr 18:32

Important change to VLF creation algorithm in SQL Server 2014

by Paul Randal

Since SQL server 2014 was released back in April last year, there have been some rumblings about changes to how many VLFs are created when the log is grown or auto-grown (I’ll just say auto-grown from now on, as that’s the most common scenario). I experimented a bit and thought I’d figured out the algorithm change. Turns out I hadn’t. There was a question on the MVP distribution list last week that rekindled the discussion and we collectively figured out that the algorithm was behaving non-deterministically… in other words we didn’t know what it was doing. So I pinged my friends in CSS who investigated the code (thanks Bob Ward and Suresh Kandoth!) and explained the change.

The change is pretty profound, and is aimed at preventing lots of auto-growth from creating huge numbers of VLFs. This is cool because having too many (it depends on the log size, but many thousands is too many) VLFs can cause all manner of performance problems around backups, restores, log clearing, replication, crash recovery, rollbacks, and even regular DML operations.

Up to 2014, the algorithm for how many VLFs you get when you create, grow, or auto-grow the log is based on the size in question:

Less than 1 MB, complicated, ignore this case.
Up to 64 MB: 4 new VLFs, each roughly 1/4 the size of the growth
64 MB to 1 GB: 8 new VLFs, each roughly 1/8 the size of the growth
More than 1 GB: 16 new VLFs, each roughly 1/16 the size of the growth

So if you created your log at 1 GB and it auto-grew in chunks of 512 MB to 200 GB, you’d have 8 + ((200 – 1) x 2 x 8) = 3192 VLFs. (8 VLFs from the initial creation, then 200 – 1 = 199 GB of growth at 512 MB per auto-grow = 398 auto-growths, each producing 8 VLFs.)

For SQL Server 2014, the algorithm is now:

Is the growth size less than 1/8 the size of the current log size?
Yes: create 1 new VLF equal to the growth size
No: use the formula above

So on SQL Server 2014, if you created your log at 1GB and it auto-grow in chunks of 512 MB to 200 GB, you’d have:

8 VLFs from the initial log creation
All growths up to the log being 4.5 GB would use the formula, so growths at 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5 GB would each add 8 VLFs = 56 VLFs
All growths over 4.5 GB will only create 1 VLF per growth = (200 – 4.5) x 2 = 391 VLFs
Total = 391 + 56 + 8 = 455 VLFs

455 is a much more reasonable number of VLFs than 3192, and will be far less of a performance problem.

A commenter asked whether compatibility level affects this? No – compatibility level is ignored by the Storage Engine internals.

I think this is an excellent change and I can’t see any drawbacks from it (apart from that it wasn’t publicized when SQL Server 2014 was released). CSS will be doing a comprehensive blog post about this soon, but they were cool with me making people aware of the details of the change ASAP to prevent confusion.

You might think that it could lead to very large VLFs (e.g. you set a 4 GB auto-growth size with a 100 GB log), and it can. But so what? Having very large VLFs is only a problem if they’re created initially and then you try to shrink the log down. At a minimum you can only have two VLFs in the log, so you’d be stuck with two giant VLFs at the start of the log and then smaller ones after you’d grown the log again. That can be a problem that prevents the log being able to wrap around and avoid auto-growth, but that’s not anywhere near as common as having too many VLFs. And that’s NOT a scenario that the new algorithm creates. (As an aside, you can fix that problem by creating a database snapshot and then reverting to it, which deletes the log and creates a 0.5 MB log with two tiny VLFs… it’s a ~~bug~~feature that’s been there since 2005, but it breaks your log backup chain when you do it.)

There’s certainly more that can be done in future around VLF management (e.g. fixing the problem I describe immediately above), but this is a giant step in the right direction.

Enjoy!

14 Apr 18:31

Log Buffer #98

by Jeff Smith

Hello and welcome to the 98th edition of Log Buffer. My name is Jeff Smith and I will hosting this week's exciting episode. If, for some reason, you are not completely satisfied with this edition, simply write in and complain to Dave over at The Pythian Group and you will receive Log Buffer #99 absolutely free! Now that is a guarantee you can feel good about. OK, let's get to work.

I have only limited exposure to both PostgreSQL and MySQL, but I have often wondered why MySQL is so popular while it seems that PostgreSQL has the superior features. Over at Xaprb, they attempt to answer that very question. Be sure to read the comments from that post, and check out the big discussion from that article over at reddit as well. The theory I like the best? MySQL is easier to pronounce! (How do you pronounce "PostgreSQL" anyway?)

Speaking of MySQL, Sheeri Cabral points out that MySQL's website certainly doesn't do the product any favors, and there's also a good discussion at Xaprb on why MySQL is Free Software but not Open Source. If you ever wanted to add a new Unicode collation to MySQL, Alexander Barkov and Peter Gulutzan provide all the information you'll need. Peter at the MySql Performance Blog tells us that MySQL lacks a good memory profiling tool, and based on his feedback, others seem to agree. (No, not those Others!) Speaking of MySQL feature requests, Justin Swanhart asks "Why does INFORMATION_SCHEMA fail to show information about TEMPORARY tables?" and also lets us know that his materialized view stored procedures for MySQL have been OKed for release. Sunny Walia (what a great name -- is it possible to not be a fun person with a name like that?) tells us how to install innotop to monitor innodb information in real-time and wonders "Oh dear MySQL slave, where did you put those rows?" Going back to the MySQL Performance Blog, Vadim warns us of a dangerous MySQL command; be sure to keep that one locked safely away from the kids.

Regarding a product I actually know a little about, Kalen Delany has a nice list of Free SQL Server Troubleshooting Tools to check out. If you haven't seen it yet, my co-blogger here at SQLTeam Mladen has an amazingly popular list of Free SQL Server Tools that might make your life a little easier that was published a while back but it is always worth mentioning. While you are visiting Mladen's blog, don't miss his latest post on getting immediate deadlock notifications for SQL Server 2005. Also, be sure to leave him lots of comments telling him that his blog is great but that he is your second favorite SQL Server blogger -- after me, of course!

Still on the topic of SQL Server, Jamie Thomson provides us with a tip for ensuring that your root folder is valid when using SSIS. Denis Gobo asks: What did you do to master SQL? (Interestingly enough, for me it was by learning MS Access first!) Tony Rogerson warns us of the performance implications of using Row_Number() in non-recursive CTE's. And Paul S. Randal describes a CHECKDB bug that people are hitting; thankfully, he says that "you can only hit this bug if you ALREADY have corruption, that it's quite rare, and that there is a workaround."

Everyone enjoys a good analogy, right? After all, a good analogy is like an ice cream cone: they both are ... hmmm ... OK, well, that's not a good analogy at all. Never mind. Speaking of bad analogies, I bet that unlike Peter Gulutzan you never really thought about the expression "half baked" before and how it relates to MySQL features. Well, now there's your chance! (Of course, a pessimist would prefer "half un-baked", but that's a discussion for another time.)

A big topic lately has been SQL Injection attacks. I always find this funny because this is the easiest problem to avoid in the history of programming; as CodeAssembly tells us, "Never concatenate user input to your queries, without exceptions." That's really all there is to it -- do that, and you are good to go. As I've written before, using parameters is not only safer, but your code is much shorter and simpler than if you concatenate strings all day long.

Federico Cargnelutti gives us an introduction on managing and applying database changes with LiquiBase, an "open source, DBMS-independent library for tracking, managing and applying database changes." I have never used LiquiBase, but sounds like something worth looking into.

While reading Magnus Hagander's PostgreSQL Blog, I found out that Yahoo claims it has the largest SQL database in a production environment -- and they use PostgreSQL. Impressive! Peter Eisentraut checks in from PGCon Day One, which included a presentation of his on porting Oracle Applications to PostgreSQL. For those of you out there using Max OS X, Perldiver has summarized instructions on building PostgreSQL on Mac OS X. Going off on a tangent, they just opened a new Apple store here in Boston on Boylston street. I visited it this weekend after getting my usual bad haircut next door. My verdict on the store? It sure looks nice, but I had no luck finding a new 5 1/4" floppy drive for my Apple II. Try to do better next time, Apple!

James McGovern offers some praise for Mark Wilcox of Oracle. Why? Because Mark has been doing some must-read blogging over at the Oracle.com blogs. Getting back to my favorite topic, which is coding SQL, Michael Armstrong-Smith instructs the Oracle crowd on using CASE to solver Outer Join issues. Shay Shmeltzer provides some tips on creating a master with two details on the same page when using ADF.

Sticking with Oracle links, Pete Finnigan ponders read only tables or read only users, and notes that in Oracle a read-only user "has approximately 27,000 other privileges because of grants to PUBLIC. This is the killer issue as because of this it is in fact not possible to create a read-only user." Hmm ... only 27,000? Come on, that doesn't seem that bad to me! Eddie Awad tells us about the Lazy Developer's way to populate a Surrogate Key and over at the Oracle Scratchpad, Jonathan Lewis provides some helpful links on Index Efficiency. Finally, if you are looking to install Oracle Database 11g Release 1 on Fedora 9 (and who isn't?), everything you need to know is covered over at Oracle-Base.

Now, if you're like me, you hate DBAs. Ah, just kidding, of course we all love our Database Administration Overlords (and I'm not just saying that because most of the people reading this probably are DBAs.) However, even the best DBAs out there occasionally make mistakes. If you have some horror stories of your own to share, or if you simply want to take pleasure in the misfortune of others, be sure to check out Kalen Delany's call for DBA Blunders. (Of course, to be fair, even us developers occasionally make mistakes.) To help avoid future blunders, consider this advice: Do not use Windows System Restore as a backout plan for SQL Server Service Packs, Cumulative Updates, or HotFixes. Also, Tara here at SQLTeam reminds us to optimize your tempdb and even provides a helpful script. I'd like to add my own helpful tip for DBAs: Schedule regular database backups! Remember, you read it here first.

Previewing the upcoming SQL Server 2008, SQLTeam's Derek discusses the Data Profiling Utility with SQL Server 2008. It sure seems like a nice tool, but that still may not make people any less nervous about SQL 2008. (Heck, my team is still nervous about SQL Server 2005!) Aaron Betrand urges people to vote if you want IntelliSense in SSMS 2008 to also support SQL Server 2005, which seems like a great idea to me, and Linchi Shea has a quick analysis of SQL Server 2008 Page Compression and its performance impact on table scans. Finally, Jamie Thomson dissects the fuzzyness of SQL Server 2008. To me, "fuzziness" is what happens to my vision after drinking too many mojitos, but Jamie is discussing a new feature in SSIS 2008 so give it a read.

I've always believed that you don't truly know all there is to know about databases until you understand the raw data structures of tables and indexes and so on. Over at MSDN Channel 9, there's a new series of videos on Data Structures and Algorithms, so be sure to watch if you want to know how database engines really work "under the hood". My enjoyment of the video was unfortunately interrupted by horrible flashbacks from my CS310 days.

Moving away from relational databases, Jim Wilson helps us to understand HBase and BigTable. Apparently, HBase is the open source implementation of Google's BigTable database, which is described as a "sparse, distributed, persistent multidimensional sorted map." In layman's terms, that means "a database with wicked huge tables."

For those looking for a laugh, be sure to read Andrew Calvett's MS SQL Server Book of Wisdom. It reminds me quite a bit of my infamous and widely misinterpreted Top 10 Things I Hate About SQL Server post from way back in the olden days. Be careful, Andrew: sometimes folks don't get it if your jokes are too subtle! (Of course, in my case, it could be that my jokes just weren't that funny.)

Well, that's all for this week. Thanks, Dave, for giving me an opportunity to write this week's Log Buffer. It was lot of fun and a welcome opportunity for me to spend more time than I usually do reading lots of great blog posts from around the internet. Have a great weekend everyone!

14 Apr 18:30

Rebuild clustered columnstore indexes when they require maintenance

by Greg Low

For general index maintenance, our friend Old Hallengren has an awesome solution for most people: https://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html

We’ve started to have customers using clustered columnstore indexes in SQL Server 2014, and they need to be treated differently. I checked Ola’s latest scripts today to see what happens with columnstore indexes. It appears that the code ignores nonclustered columnstore indexes (ie: index type of 6), which makes sense as we need to rebuild them whenever the data changes, and in the meantime, the table is read-only. So that makes lots of sense.

For other indexes (including clustered columnstore indexes which are updatable), the code uses the same avg_fragmentation_in_percent values from sys.dm_db_index_physical_stats. Unfortunately, that view doesn’t return any data for columnstore indexes of either type, so that’s not actually helpful.

Niko has an awesome series of blog posts about columnstore indexes that I encourage you to read: http://www.nikoport.com/columnstore/ In part 36, he discusses what he thinks is needed for maintenance. The metrics that I want to use overlap the ones that he’s suggested in that article.

With a clustered columnstore index, when a row is deleted, a flag in a bitmap is updated to indicate the deletion. The row is still present in the columnstore row group. When an update occurs, the same mechanism flags the row as deleted, then inserts the new version into the delta store.

The main problems that will occur are when many rows have been deleted.

In the code below, I’ve provided an implementation that will rebuild clustered columnstore indexes in a database when one of three criteria is met:

The overall percentage of deleted rows has exceeded a cutoff (I’ve suggested a 10% default)
The percentage of rows in any individual segment exceeds a cutoff (I’ve suggested a 20% default)
The number of segments that are completely empty (all rows have been deleted) has exceeded a cutoff (I’ve suggested none)

This would work for both COLUMNSTORE and COLUMNSTORE_ARCHIVE compression.

Hope it helps someone until Ola does his magic:

----------------------------------------------------------------------

-- Rebuild clustered columnstore indexes when necessary

-- Dr Greg Low v1.0

----------------------------------------------------------------------

-- rebuild when more than supplied % of rows have been deleted

-- rebuild when any segments contain more than supplied % deleted rows

-- rebuild if more than supplied number of segments are empty

DECLARE @DeletedTotalRowPercentage int = 10;

DECLARE @DeletedSegmentsRowPercentage int = 20;

DECLARE @EmptySegmentsAllowed int = 0;

----------------------------------------------------------------------

DECLARE @IndexesToRebuild TABLE (SchemaName sysname,

TableName sysname,

IndexName sysname);

WITH ClusteredColumnstoreIndexes

( SELECT t.object_id AS ObjectID,

SCHEMA_NAME(t.schema_id) AS SchemaName,

t.name AS TableName,

i.name AS IndexName

FROM sys.indexes AS i

INNER JOIN sys.tables AS t

ON i.object_id = t.object_id

WHERE i.type = 5

RowGroups

( SELECT csrg.object_id AS ObjectID,

csrg.total_rows AS TotalRows,

csrg.deleted_rows AS DeletedRows,

csrg.deleted_rows * 100.0 / csrg.total_rows AS DeletedPercentage,

CASE WHEN csrg.total_rows = csrg.deleted_rows

THEN 1 ELSE 0

END AS IsEmptySegment

FROM sys.column_store_row_groups AS csrg

IndexStats

( SELECT cci.ObjectID,

cci.SchemaName,

cci.TableName,

cci.IndexName,

SUM(rg.TotalRows) AS TotalRows,

SUM(rg.DeletedRows) AS DeletedRows,

SUM(rg.DeletedRows) * 100.0 / SUM(rg.TotalRows) AS DeletedPercentage,

SUM(rg.IsEmptySegment) aS EmptySegments

FROM ClusteredColumnstoreIndexes AS cci

INNER JOIN RowGroups AS rg

ON cci.ObjectID = rg.ObjectID

GROUP BY cci.ObjectID, cci.SchemaName, cci.TableName, cci.IndexName

)

INSERT @IndexesToRebuild (SchemaName, TableName, IndexName)

SELECT s.SchemaName, s.TableName, s.IndexName

FROM IndexStats AS s

WHERE s.DeletedPercentage > @DeletedTotalRowPercentage

OR s.EmptySegments > @EmptySegmentsAllowed

OR EXISTS(SELECT 1 FROM RowGroups AS rg

WHERE rg.ObjectID = s.ObjectID

AND rg.DeletedPercentage > @DeletedSegmentsRowPercentage);

DECLARE @SchemaName sysname;

DECLARE @TableName sysname;

DECLARE @IndexName sysname;

DECLARE @SqlCommand nvarchar(max);

DECLARE IndexList CURSOR FAST_FORWARD READ_ONLY

FOR

SELECT SchemaName, TableName, IndexName

FROM @IndexesToRebuild

ORDER BY SchemaName, TableName, IndexName;

OPEN IndexList;

FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName;

WHILE @@FETCH_STATUS = 0

BEGIN

SET @SqlCommand = N'ALTER INDEX ' + QUOTENAME(@IndexName)

+ N' ON ' + QUOTENAME(@SchemaName)

+ N'.' + QUOTENAME(@TableName)

+ N' REBUILD PARTITION = ALL;';

PRINT @SqlCommand;

EXEC(@SqlCommand);

FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName;

END;

CLOSE IndexList;

DEALLOCATE IndexList;

14 Apr 18:30

Demos must work–simple concept? So why is it so rarely applied?

by Greg Low

When I attend events like TechEd, like many people I usually find the networking time more valuable than the session time. There is a pretty tight limit on the number of sessions you can attend, no matter how hard you try. So I often watch the sessions later when I can. At one of the earliest TechEd Australia events that I attended, they gave us CDs of the TechEd USA sessions. That was great because that TechEd had around 250 sessions, and there were over 150 that I would have loved to attend. Clearly that wasn’t possible.

I was doing a lot of driving to/from client sites at that time, and so I had a lot of available listening time. I dragged the audio out of all the TechEd session videos and just listened to the audio. If a session really interested me, I’d go back and watch the video and demos.

That year, I listened to/watched around 150 sessions. While it was intense and interesting, there was something that I wasn’t expecting, that completely stunned me. I heard the presenters apologising for demo failures in close to 100 of those sessions. I found that really, really hard to believe. I was determined that when I was presenting any sort of session at events like these, that I wasn’t going to be one of those people, or I’d do my very best to avoid it.

Having your session work as planned already puts you in the top third of all the sessions.

So if you want to be on the right side of this equation, what can you do?

1. Have realistic goals for the amount of content.

I normally aim to tell people three things in a session. They certainly won’t remember more than that and it’s the stories that they’ll remember anyway, so make sure that each demo has a good story associated with it. And if showing any one of these three things takes more than about 15 to 20 minutes, try again. Blaise Pascal said “I would have written a shorter letter, but I did not have the time”. Plan the content and tell the story succinctly. Plan your timing. It’s hard work to get just the right message in just the right amount of time.

I have lost count of how many sessions I’ve been to that run out of time or that failed to make one of the key points. Don’t be one of them.

You’ll be especially sorry if your session description includes content that you didn’t end up covering. Someone might have come just for that content.

2. Aim for repeatable achievable outcomes.

I’ve seen so many demos that would probably only ever work with the moon in the correct position and the presenter holding his/her head the right way. Don’t do this.

3. Have a clear structure.

There’s perceived wisdom that sessions should be all demos. I don’t buy it as the only rule. I’ve been to brilliant sessions with none, and I’ve been to horrid sessions from amazing people where they have no structure to what they are trying to show.

4. Practice both the session and the demos.

And I mean multiple times. The bigger the event, the more the whole session needs to be second nature. Try to deliver the session at smaller venues first. Local user groups, virtual sessions, etc. are good options for this.

5. Find another presenter as a critical friend.

I have friends that are talented presenters and I love having one of them in the room for trial runs, with a view of being critical. Someone that says “yeah that was great” is nice. Someone that says “you lost me in the second part of the demo” or “I think the third demo would work better if you…” is what you need. Be prepared to do the same for them.

6. Record the demos.

When presenting at large events, I have a series of screen shots saved on a USB key, and I also have a full video walkthrough of each of the demos. I’m determined that the audience will get to see the demos.

As a simple example of when this has saved me, two years ago for TechEd Australia, I was presenting some Azure-related sessions. The Azure folk had decided to do maintenance and take things offline right in the middle of the event.

I told the audience, switched across to the videos of each demo, which I did a live voice-over for, and I suspect that many of them would have quickly forgotten that they were watching a video. By comparison, I attended several other Azure-related sessions at that same event and watched presenter after presenter stumbling when things would not work. You need a fall back plan.

Hint: Don’t just play the video with voice, etc. as well though – make it still pretty much a live thing. I’ve seen sessions where people just play a video with sound and it often looks like they could never have actually done the demo, particularly if it’s someone else’s voice.

7. Don’t try to debug live.

Unless it’s an obvious and trivial issue, you will do far more damage by trying to debug it. Attendees hate watching you stuff around trying to fix issues. You might feel great *if* you ever get it solved but you will have messed up your timing and possibly looked really, really bad in the meantime. And if you can’t solve it, you will have really messed up. Instead, move on and use your backup plan.

Isn’t this what everyone does?

It seems pretty basic to do these things but time and again, I see the opposite happening even at major events. I watched AzureConf the other day, and even the keynote had these issues. Having been involved in event keynotes and knowing what level of rehearsal normally goes into them, I can just imagine the discussions that went on later. They wouldn’t have been pretty.

You can avoid being one of these statistics with just a bit of planning, and you’ll already be ahead of the pack.

14 Apr 18:29

Improved clustered columnstore index rebuild–potential segment fullness

by Greg Low

Earlier, I posted a maintenance script that could be used with clustered columnstore indexes to try to determine when they should be rebuilt. Reorganize is not a very useful operation for these indexes as it basically just forces a close of the existing delta store, ready for compression.

One of the problems with a clustered columnstore index is that it doesn’t work as well if the segments have been created at less than the maximum potential size of 1048576 rows per segment. Over use of reorganize options could cause this, but more likely it will be to do with loading the index with small batches of data, rather than loading large batches in BULK INSERT operations.

I’ve updated the maintenance script so that you can now also choose to rebuild if your segments are much less full than their optimal size ie: you could have 1,048,560,000 rows of data in 10,000 segments. So if you say 90 percent as the cutoff and your segments hold less than 90 percent of their potential, that will also trigger a rebuild.

Hope this helps someone:

----------------------------------------------------------------------

-- Rebuild clustered columnstore indexes when necessary

-- Dr Greg Low v2.0

----------------------------------------------------------------------

-- rebuild when more than supplied % of rows have been deleted

-- rebuild when any segments contain more than supplied % deleted rows

-- rebuild if more than supplied number of segments are empty

-- rebuild when segments are on average less than the supplied % of

-- maximum capacity

DECLARE @DeletedTotalRowPercentage int = 10;

DECLARE @DeletedSegmentsRowPercentage int = 20;

DECLARE @EmptySegmentsAllowed int = 0;

DECLARE @SegmentPotentialFullnessPercentage int = 90;

----------------------------------------------------------------------

DECLARE @IndexesToRebuild TABLE (SchemaName sysname,

TableName sysname,

IndexName sysname);

WITH ClusteredColumnstoreIndexes

( SELECT t.object_id AS ObjectID,

SCHEMA_NAME(t.schema_id) AS SchemaName,

t.name AS TableName,

i.name AS IndexName

FROM sys.indexes AS i

INNER JOIN sys.tables AS t

ON i.object_id = t.object_id

WHERE i.type = 5

RowGroups

( SELECT csrg.object_id AS ObjectID,

csrg.total_rows AS TotalRows,

csrg.deleted_rows AS DeletedRows,

csrg.deleted_rows * 100.0 / csrg.total_rows AS DeletedPercentage,

CASE WHEN csrg.total_rows = csrg.deleted_rows

THEN 1 ELSE 0

END AS IsEmptySegment

FROM sys.column_store_row_groups AS csrg

IndexStats

( SELECT cci.ObjectID,

cci.SchemaName,

cci.TableName,

cci.IndexName,

SUM(CAST(rg.TotalRows AS decimal(18,0))) AS TotalRows,

SUM(CAST(rg.DeletedRows AS decimal(18,0))) AS DeletedRows,

SUM(CAST(rg.DeletedRows AS decimal(18,0))) * 100.0

/ SUM(CAST(rg.TotalRows AS decimal(18,0)))
AS DeletedPercentage,

SUM(rg.IsEmptySegment) aS EmptySegments,

COUNT(rg.TotalRows) AS TotalSegments

FROM ClusteredColumnstoreIndexes AS cci

INNER JOIN RowGroups AS rg

ON cci.ObjectID = rg.ObjectID

GROUP BY cci.ObjectID, cci.SchemaName, cci.TableName, cci.IndexName

)

INSERT @IndexesToRebuild (SchemaName, TableName, IndexName)

SELECT s.SchemaName, s.TableName, s.IndexName

FROM IndexStats AS s

WHERE s.DeletedPercentage > @DeletedTotalRowPercentage

OR s.EmptySegments > @EmptySegmentsAllowed

OR EXISTS(SELECT 1 FROM RowGroups AS rg

WHERE rg.ObjectID = s.ObjectID

AND rg.DeletedPercentage > @DeletedSegmentsRowPercentage)

OR (s.TotalRows * 100 / (s.TotalSegments * 1048576.0))
< @SegmentPotentialFullnessPercentage;

DECLARE @SchemaName sysname;

DECLARE @TableName sysname;

DECLARE @IndexName sysname;

DECLARE @SqlCommand nvarchar(max);

DECLARE IndexList CURSOR FAST_FORWARD READ_ONLY

FOR

SELECT SchemaName, TableName, IndexName

FROM @IndexesToRebuild

ORDER BY SchemaName, TableName, IndexName;

OPEN IndexList;

FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName;

WHILE @@FETCH_STATUS = 0

BEGIN

SET @SqlCommand = N'ALTER INDEX ' + QUOTENAME(@IndexName)

+ N' ON ' + QUOTENAME(@SchemaName)

+ N'.' + QUOTENAME(@TableName)

+ N' REBUILD PARTITION = ALL;';

PRINT @SqlCommand;

EXEC(@SqlCommand);

FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName;

END;

CLOSE IndexList;

DEALLOCATE IndexList;

14 Apr 18:29

JDBC: This driver is not configured for integrated authentication

by Adam W. Saxton

I’ve had about 4 cases in the last two months that centered around the following error when trying to use Windows Integrated authentication with JDBC.

java.sql.SQLException: This driver is not configured for integrated authentication

The key environment point on this was that they were trying to do this on a Linux platorm and not a Windows platform. Specifically they were running WebSphere on a Linux platform. The last one I worked on was running WebSphere 8.5.

There is only one location within the JDBC Driver where this particular error is raised. It is when we are trying to use Kerberos and the Authentication Scheme is set to NativeAuthentication, which is the default setting for this property. Starting in the JDBC 4.0 driver, you can use the authenticationScheme connection property to indicate how you want to use Kerberos to connect to SQL. There are two settings here.

NativeAuthentication (default) – This uses the sqljdbc_auth.dll and is specific to the Windows platform. This was the only option prior to the JDBC 4.0 driver.

JavaKerberos – Makes use of the Java API’s to invoke kerberos and does not rely on the Windows Platform. This is java specific and not bound to the underlying operating system, so this can be used on both Windows and Linux platforms.

So, if you are receiving the error above, there are three possibilities that could be causing it to show up. First, you actually specified authenticationScheme=NativeAuthentication in your connection string and you are on a Linux Platform. Second, you specified integratedSecurity and omitted authenticationScheme, which defaulted to NativeAuthentication, and you are on a Unix/Linux Platform. Third, you are using a version of the JDBC Driver prior to the 4.0 driver and trying to use Integrated Authentication on a Unix/Linux platform. In the third case, even if you specify authenticationScheme=JavaKerberos, it won’t help as the older drivers aren’t aware of it, so it is ignored.

The following document outlines how to use Kerberos with the JDBC Driver and walks through what is needed to get JavaKerberos working properly.

Using Kerberos Integrated Authentication to Connect to SQL Server
http://msdn.microsoft.com/en-us/library/gg558122%28v=sql.110%29.aspx

Another aspect that was discovered was the that it appears that the WebSphere 8.5 release comes with the 3.0 version of the SQL JDBC Driver. This will not honor the JavaKerberos setting and you will get the error listed above.

Configuration

So, you will need to make sure your driver is updated to the 4.0 driver or later. After that is done, you will need to make sure that the Kerberos Configuration file (krb5.ini or krb5.conf) is configured properly on your platform. In the above referenced documentation we have a sample of what that should look like. You will also need to generate keytab files for the platform to reference. A login configuration file also needs to be setup. If you don’t have one, the driver will automatically configure it using the Krb5LoginModule. If you need to use a different Login Module, you will need to make sure that is configured for your environment. Assuming all of that is in place, the driver should work using JavaKerberos to connect.

The following blog does a good job of walking through the steps to get this setup for Java. It indicates Weblogic, but really it just goes through the java aspects. It walks through how to create the keytab files and what to do with the krb5.ini file.

Configure Kerberos with Weblogic Server (really just a Java reference)
https://blogbypuneeth.wordpress.com/configure-kerberos-with-weblogic-server/

Known Limitation

If you have a multiple domain environment with SQL Servers in different domains that you are trying to hit, you will run into issues. We found that in order to get it to work properly, you need to set the default domain within the Kerberos configuration file, to the domain that the SQL Server resides in. You can only have one default domain, so if you have multiple SQL Servers in different domains, you are going to have to pick one.

SQL JDBC Driver Versioning and Files

I’ve also heard a lot of questions and seen confusion on the file versioning, file name and system requirements. Here is a table where I tried to highlight what comes with what driver for reference.

JDBC Driver Version

JAR Files

JDBC API Support

Supported JVM

2.0

sqljdbc.jar

3.0

1.5

sqljdbc4.jar

4.0

1.6 or later

3.0

sqljdbc.jar

3.0

1.5

sqljdbc4.jar

4.0

1.6 or later

4.0

sqljdbc.jar

3.0

1.5

sqljdbc4.jar

4.0

1.6 or later

4.1

sqljdbc.jar

3.0

1.5

sqljdbc4.jar

4.0

1.6 or later

sqljdbc41.jar

4.0

1.7 or later

Also, we have documentation regarding the System Requirements that you can look at that goes a little further into this.

System Requirements for the JDBC Driver
http://msdn.microsoft.com/en-us/library/ms378422(v=sql.110).aspx

Hopefully this will help clear things up for you when using the SQL JDBC Driver on a Unix/Linux Platform.

Adam W. Saxton | Microsoft SQL Server Escalation Services
http://twitter.com/awsaxton

14 Apr 18:28

Pagination with OFFSET / FETCH : A better way

by Aaron Bertrand

Pagination is a common use case throughout client and web applications everywhere. Google shows you 10 results at a time, your online bank may show 20 bills per page, and bug tracking and source control software might display 50 items on the screen.

Based on the indexing of the table, the columns needed, and the sort method chosen, paging can be relatively painless. If you're looking for the "first" 20 customers and the clustered index supports that sorting (say, a clustered index on an IDENTITY column or DateCreated column), then the query is going to be pretty efficient. If you need to support sorting that requires non-clustered indexes, and especially if you have columns needed for output that aren't covered by the index (never mind if there is no supporting index), the queries can get more expensive. And even the same query (with a different @PageNumber parameter) can get much more expensive as the @PageNumber gets higher – since more reads may be required to get to that "slice" of the data.

Some will say that progressing toward the end of the set is something that you can solve by throwing more memory at the problem (so you eliminate any physical I/O) and/or using application-level caching (so you're not going to the database at all). Let's assume for the purposes of this post that more memory isn't always possible, since not every customer can add RAM to a server that's out of memory slots, or just snap their fingers and have newer, bigger servers ready to go. Especially since some customers are on Standard Edition, so are capped at 64GB (SQL Server 2012) or 128GB (SQL Server 2014), or are using even more limited editions such as Express (1GB) or whatever they're calling Azure SQL Database this week (many different servicing tiers).

So I wanted to look at the common paging approach on SQL Server 2012 – OFFSET / FETCH – and suggest a variation that will lead to more linear paging performance across the entire set, instead of only being optimal at the beginning. Which, sadly, is all that a lot of shops will test.

Setup

I'm going to borrow from a recent post, Bad habits : Focusing only on disk space when choosing keys, where I populated the following table with 1,000,000 rows of random-ish (but not entirely realistic) customer data:

CREATE TABLE [dbo].[Customers_I]
(
  [CustomerID] [int] IDENTITY(1,1) NOT NULL,
  [FirstName] [nvarchar](64) NOT NULL,
  [LastName] [nvarchar](64) NOT NULL,
  [EMail] [nvarchar](320) NOT NULL,
  [Active] [bit] NOT NULL DEFAULT ((1)),
  [Created] [datetime] NOT NULL DEFAULT (sysdatetime()),
  [Updated] [datetime] NULL,
  CONSTRAINT [C_PK_Customers_I] PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO
CREATE NONCLUSTERED INDEX [C_Active_Customers_I] 
  ON [dbo].[Customers_I]
  ([FirstName] ASC, [LastName] ASC, [EMail] ASC)
  WHERE ([Active] = 1);
GO
CREATE UNIQUE NONCLUSTERED INDEX [C_Email_Customers_I] 
  ON [dbo].[Customers_I]
  ([EMail] ASC);
GO
CREATE NONCLUSTERED INDEX [C_Name_Customers_I] 
  ON [dbo].[Customers_I]
  ([LastName] ASC, [FirstName] ASC)
  INCLUDE ([EMail]);
GO

Since I knew I would be testing I/O here, and would be testing from both a warm and cold cache, I made the test at least a little bit more fair by rebuilding all of the indexes to minimize fragmentation (as would be done less disruptively, but regularly, on most busy systems that are performing any type of index maintenance):

ALTER INDEX ALL ON dbo.Customers_I REBUILD WITH (ONLINE = ON);

After the rebuild, fragmentation comes in now at 0.05% – 0.17% for all indexes (index level = 0), pages are filled over 99%, and the row count / page count for the indexes are as follows:

Index	Page Count	Row Count
C_PK_Customers_I (clustered index)	19,210	1,000,000
C_Email_Customers_I	7,344	1,000,000
C_Active_Customers_I (filtered index)	13,648	815,235
C_Name_Customers_I	16,824	1,000,000

Indexes, page counts, row counts

This obviously isn't a super-wide table, and I've left compression out of the picture this time. Perhaps I will explore more configurations in a future test.

Paging Scenarios

Typically, users will formulate a paging query like this (I'm going to leave the old-school, pre-2012 methods out of this post):

SELECT [a_bunch_of_columns] 
  FROM dbo.[some_table]
  ORDER BY [some_column_or_columns] 
  OFFSET @PageSize * (@PageNumber - 1) ROWS
  FETCH NEXT @PageSize ROWS ONLY;

As I mentioned above, this works just fine if there is an index that supports the ORDER BY and that covers all of the columns in the SELECT clause (and, for more complex queries, the WHERE and JOIN clauses). However, the sort costs might be overwhelming with no supporting index, and if the output columns aren't covered, you will either end up with a whole bunch of key lookups, or you may even get a table scan in some scenarios.

Let's get more specific. Given the table and indexes above, I wanted to test these scenarios, where we want to show 100 rows per page, and output all of the columns in the table:

Default – ORDER BY CustomerID (clustered index).
Phone book – ORDER BY LastName, FirstName (supporting non-clustered index).
User-defined – ORDER BY FirstName DESC, EMail (no supporting index).

I wanted to test these methods and compare plans and metrics when – under both warm cache and cold cache scenarios – looking at page 1, page 500, page 5,000, and page 9,999. So I created these procedures (differing only by the ORDER BY clause):

CREATE PROCEDURE dbo.Pagination_Test_1 -- ORDER BY CustomerID
  @PageNumber INT = 1,
  @PageSize   INT = 100
AS
BEGIN
  SET NOCOUNT ON;
 
  SELECT CustomerID, FirstName, LastName,
      EMail, Active, Created, Updated
    FROM dbo.Customers_I
    ORDER BY CustomerID
    OFFSET @PageSize * (@PageNumber - 1) ROWS
    FETCH NEXT @PageSize ROWS ONLY OPTION (RECOMPILE);
END
GO
 
CREATE PROCEDURE dbo.Pagination_Test_2 -- ORDER BY LastName, FirstName
CREATE PROCEDURE dbo.Pagination_Test_3 -- ORDER BY FirstName DESC, EMail

In reality, you will probably just have one procedure that either uses dynamic SQL or a CASE expression to dictate the order. In either case, you may see best results by using OPTION (RECOMPILE) on the query to avoid reuse of plans that are optimal for one sorting option but not all. I created separate procedures here to take those variables away; I added OPTION (RECOMPILE) for these tests to stay away from parameter sniffing and other optimization issues without flushing the entire plan cache repeatedly.

An alternate approach

A slightly different approach, which I don't see implemented very often, is to locate the "page" we're on using only the clustering key, and then join to that:

;WITH pg AS 
(
  SELECT [key_column] 
  FROM dbo.[some_table]
  ORDER BY [some_column_or_columns] 
  OFFSET @PageSize * (@PageNumber - 1) ROWS
  FETCH NEXT @PageSize ROWS ONLY
)
SELECT t.[bunch_of_columns]
  FROM dbo.[some_table] AS t
  INNER JOIN pg ON t.[key_column] = pg.[key_column] -- or EXISTS
  ORDER BY [some_column_or_columns];

It's more verbose code, of course, but hopefully it's clear what SQL Server can be coerced into doing: avoiding a scan, or at least deferring lookups until a much smaller resultset is whittled down. Paul White (@SQL_Kiwi) investigated a similar approach back in 2010, before OFFSET/FETCH was introduced in the early SQL Server 2012 betas (I first blogged about it later that year).

Given the scenarios above, I created three more procedures, with the only difference between the column(s) specified in the ORDER BY clauses (we now need two, one for the page itself, and one for ordering the result):

CREATE PROCEDURE dbo.Alternate_Test_1 -- ORDER BY CustomerID
  @PageNumber INT = 1,
  @PageSize   INT = 100
AS
BEGIN
  SET NOCOUNT ON;
 
  ;WITH pg AS
  (
    SELECT CustomerID
      FROM dbo.Customers_I
      ORDER BY CustomerID
      OFFSET @PageSize * (@PageNumber - 1) ROWS
      FETCH NEXT @PageSize ROWS ONLY
  )
  SELECT c.CustomerID, c.FirstName, c.LastName,
      c.EMail, c.Active, c.Created, c.Updated
  FROM dbo.Customers_I AS c
  WHERE EXISTS (SELECT 1 FROM pg WHERE pg.CustomerID = c.CustomerID)
  ORDER BY c.CustomerID OPTION (RECOMPILE);
END
GO
 
CREATE PROCEDURE dbo.Alternate_Test_2 -- ORDER BY LastName, FirstName
CREATE PROCEDURE dbo.Alternate_Test_3 -- ORDER BY FirstName DESC, EMail

Note: This may not work so well if your primary key is not clustered – part of the trick that makes this work better, when a supporting index can be used, is that the clustering key is already in the index, so a lookup is often avoided.

Testing the clustering key sort

First I tested the case where I didn't expect much variance between the two methods – sorting by the clustering key. I ran these statements in a batch in SQL Sentry Plan Explorer and observed duration, reads, and the graphical plans, making sure that each query was starting from a completely cold cache:

SET NOCOUNT ON;
-- default method
DBCC DROPCLEANBUFFERS;
EXEC dbo.Pagination_Test_1 @PageNumber = 1;
DBCC DROPCLEANBUFFERS;
EXEC dbo.Pagination_Test_1 @PageNumber = 500;
DBCC DROPCLEANBUFFERS;
EXEC dbo.Pagination_Test_1 @PageNumber = 5000;
DBCC DROPCLEANBUFFERS;
EXEC dbo.Pagination_Test_1 @PageNumber = 9999;
 
-- alternate method
DBCC DROPCLEANBUFFERS;
EXEC dbo.Alternate_Test_1 @PageNumber = 1;
DBCC DROPCLEANBUFFERS;
EXEC dbo.Alternate_Test_1 @PageNumber = 500;
DBCC DROPCLEANBUFFERS;
EXEC dbo.Alternate_Test_1 @PageNumber = 5000;
DBCC DROPCLEANBUFFERS;
EXEC dbo.Alternate_Test_1 @PageNumber = 9999;

The results here were not astounding. Over 5 executions the average number of reads are shown here, showing negligible differences between the two queries, across all page numbers, when sorting by the clustering key:

The plan for the default method (as shown in SQL Sentry Plan Explorer) in all cases was as follows:

While the plan for the CTE-based method looked like this:

Now, while I/O was the same regardless of caching (just a lot more read-ahead reads in the cold cache scenario), I measured the duration with a cold cache and also with a warm cache (where I commented out the DROPCLEANBUFFERS commands and ran the queries multiple times before measuring). These durations looked like this:

While you can see a pattern that shows duration increasing as the page number gets higher, keep the scale in mind: to hit rows 999,801 -> 999,900, we're talking half a second in the worst case and 118 milliseconds in the best case. The CTE approach wins, but not by a whole lot.

Testing the phone book sort

Next, I tested the second case, where the sorting was supported by a non-covering index on LastName, FirstName. The query above just changed all instances of Test_1 to Test_2. Here were the reads using a cold cache:

(The reads under a warm cache followed the same pattern – the actual numbers differed slightly, but not enough to justify a separate chart.)

When we're not using the clustered index to sort, it is clear that the I/O costs involved with the traditional method of OFFSET/FETCH are far worse than when identifying the keys first in a CTE, and pulling the rest of the columns just for that subset.

Here is the plan for the traditional query approach:

And the plan for my alternate, CTE approach:

Finally, the durations:

The traditional approach shows a very obvious upswing in duration as you march toward the end of the pagination. The CTE approach also shows a non-linear pattern, but it is far less pronounced and yields better timing at every page number. We see 117 milliseconds for the second-to-last page, versus the traditional approach coming in at almost two seconds.

Testing the user-defined sort

Finally, I changed the query to use the Test_3 stored procedures, testing the case where the sort was defined by the user and did not have a supporting index. The I/O was consistent across each set of tests; the graph is so uninteresting, I'm just going to link to it. Long story short: there were a little over 19,000 reads in all tests. The reason is because every single variation had to perform a full scan due to the lack of an index to support the ordering. Here is the plan for the traditional approach:

And while the plan for the CTE version of the query looks alarmingly more complex…

…it actually leads to lower durations in all but one case. Here are the durations:

You can see that we can't get linear performance here using either method, but the CTE does come out on top by a good margin (anywhere from 16% to 65% better) in every single case except the cold cache query against the first page (where it lost by a whopping 8 milliseconds). Also interesting to note that the traditional method isn't helped much at all by a warm cache in the "middle" (pages 500 and 5000); only toward the end of the set is any efficiency worth mentioning.

Higher volume

After individual testing of a few executions and taking averages, I thought it would also make sense to test a high volume of transactions that would somewhat simulate real traffic on a busy system. So I created a job with 6 steps, one for each combination of query method (traditional paging vs. CTE) and sort type (clustering key, phone book, and unsupported), with a 100-step sequence of hitting the four page numbers above, 10 times each, and 60 other page numbers chosen at random (but the same for each step). Here is how I generated the job creation script:

SET NOCOUNT ON;
DECLARE @sql NVARCHAR(MAX), @job SYSNAME = N'Paging Test', @step SYSNAME, @command NVARCHAR(MAX);
 
;WITH t10 AS (SELECT TOP (10) number FROM master.dbo.spt_values),
f AS (SELECT f FROM (VALUES(1),(500),(5000),(9999)) AS f(f))
SELECT @sql = STUFF((SELECT CHAR(13) + CHAR(10) 
  + N'EXEC dbo.$p$_Test_$v$ @PageNumber = ' + RTRIM(f) + ';'
  FROM
  (
    SELECT f FROM
    (
      SELECT f.f FROM t10 CROSS JOIN f 
      UNION ALL 
      SELECT TOP (60) f = ABS(CHECKSUM(NEWID())) % 10000 
	    FROM sys.all_objects
    ) AS x
  ) AS y ORDER BY NEWID() 
  FOR XML PATH(''),TYPE).value(N'.[1]','nvarchar(max)'),1,0,'');
 
IF EXISTS (SELECT 1 FROM msdb.dbo.sysjobs WHERE name = @job)
BEGIN
  EXEC msdb.dbo.sp_delete_job @job_name = @job;
END
 
EXEC msdb.dbo.sp_add_job
  @job_name = @job, 
  @enabled = 0, 
  @notify_level_eventlog = 0, 
  @category_id = 0, 
  @owner_login_name = N'sa';
 
EXEC msdb.dbo.sp_add_jobserver
  @job_name = @job,
  @server_name = N'(local)';
 
DECLARE c CURSOR LOCAL FAST_FORWARD FOR
SELECT step = p.p + '_' + v.v, 
    command = REPLACE(REPLACE(@sql, N'$p$', p.p), N'$v$', v.v)
  FROM 
  (SELECT v FROM (VALUES('1'),('2'),('3')) AS v(v)) AS v
  CROSS JOIN 
  (SELECT p FROM (VALUES('Alternate'),('Pagination')) AS p(p)) AS p
  ORDER BY p.p, v.v;
 
OPEN c; FETCH c INTO @step, @command;
 
WHILE @@FETCH_STATUS <> -1
BEGIN
  EXEC msdb.dbo.sp_add_jobstep
    @job_name   = @job,
    @step_name  = @step,
    @command    = @command,
    @database_name = N'IDs',
    @on_success_action = 3;
 
  FETCH c INTO @step, @command;
END
 
EXEC msdb.dbo.sp_update_jobstep
  @job_name = @job, 
  @step_id  = 6, 
  @on_success_action = 1; -- quit with success
 
PRINT N'EXEC msdb.dbo.sp_start_job @job_name = ''' + @job + ''';';

Here is the resulting job step list and one of the step's properties:

I ran the job five times, then reviewed the job history, and here were the average runtimes of each step:

I also correlated one of the executions on the SQL Sentry Event Manager calendar…

…with the SQL Sentry Performance Advisor dashboard, and manually marked roughly where each of the six steps ran. Here is the CPU usage chart from the Windows side of the dashboard:

And from the SQL Server side of the dashboard, the interesting metrics were in the Key Lookups and Waits graphs:

The most interesting observations just from a purely visual perspective:

CPU is pretty hot, at around 80%, during step 3 (CTE + no supporting index) and step 6 (traditional + no supporting index);
CXPACKET waits are relatively high during step 3 and to a lesser extent during step 6;
you can see the massive jump in key lookups, to almost 600,000, in about a one-minute span (correlating to step 5 – the traditional approach with a phone book-style index).

In a future test – as with my previous post on GUIDs – I'd like to test this on a system where the data doesn't fit into memory (easy to simulate) and where the disks are slow (not so easy to simulate), since some of these results probably do benefit from things not eveyr production system has – fast disks and sufficient RAM. I also should expand the tests to include more variations (using skinny and wide columns, skinny and wide indexes, a phone book index that actually covers all of the output columns, and sorting in both directions). Scope creep definitely limited the extent of my testing for this first set of tests.

Conclusion

Pagination doesn't always have to be painful; SQL Server 2012 certainly makes the syntax easier, but if you just plug the native syntax in, you might not always see a great benefit. Here I have shown that slightly more verbose syntax using a CTE can lead to much better performance in the best case, and arguably negligible performance differences in the worst case. By separating data location from data retrieval into two different steps, we can see a tremendous benefit in some scenarios, outside of higher CXPACKET waits in one case (and even then, the parallel queries finished faster than the other queries displaying little or no waits, so they were unlikely to be the "bad" CXPACKET waits everyone warns you about).

Still, even the faster method is slow when there is no supporting index. While you may be tempted to implement an index for every possible sorting algorithm a user might choose, you may want to consider providing fewer options (since we all know that indexes aren't free). For example, does your application absolutely need to support sorting by LastName ascending *and* LastName descending? If they want to go directly to the customers whose last names start with Z, can't they go to the *last* page and work backward? That's a business and usability decision more than a technical one, just keep it as an option before slapping indexes on every sort column, in both directions, in order to get the best performance for even the most obscure sorting options.

The post Pagination with OFFSET / FETCH : A better way appeared first on SQLPerformance.com.

14 Apr 18:28

APS / PDW Monitoring

by Rob Farley

When you get a Analytics Platform System (APS) – the appliance that hosts a copy SQL Server Parallel Data Warehouse Edition (EDW) and potentially a Hadoop cluster as well – one of the things that you get is a Management Console. It’s quite handy to be able to look at it and see if your appliance is healthy or not.

The trouble with the Management Console, though, is what I’ve just said – you look at it to assess the health of your appliance. And this isn’t something that you really want to do. What are you going to do – look at it every morning and then get on with your day? If this is what you’re doing to monitor your environment, the chances are you won’t be watching in the moment you start to actually have a problem, and really you’ll be getting the alert when your phone rings, or at least after you’ve picked it up and got an earful of ‘angry customer’.

You need a better solution – something which is going to actually let you know if a problem is detected, and which is going to store some historical information so that you can do some troubleshooting. You know – just like you have on your other SQL boxes. The thing is that PDW doesn’t come with something which can send emails. And it doesn’t come with a scheduler like SQL Agent which can be used to kick off jobs to collect data, or to check perfmon counters in case there’s an issue.

So how do you monitor PDW?

The answer is: from outside it.

Ideally, your organisation has centralised monitoring anyway. Some server(s) that run around checking that everything is healthy. They check the health of the Windows boxes, they see how much disk space is free on all the volumes, they make sure that backups have been taken (not just SQL backups, but backups of whole servers), and run tests like DBCC, restores, and pings. If there’s a problem with the monitoring servers, this is recognised and understood.

Now, you can roll your own one of these, and have it query DMVs like PDW’s sys.dm_pdw_component_health_alerts, sys.dm_pdw_errors, and sys.dm_pdw_os_performance_counters. Collecting this information (and the contents of many other DMVs) can help provide excellent coverage when troubleshooting, and also highlight potential issues before they arrive. Running DBCC PDW_SHOWSPACEUSED from time to time is definitely important, to be conscious of how skew is looking – the list of things to keep an eye on goes on and on.

Something that shouldn’t be overlooked is the usefulness of System Center Operations Manager (even if I keep wanting to type ‘Centre’ instead of ‘Center’). There are SCOM Management Packs available to cater for PDW, HDInsight (another component within APS) and APS itself. If SCOM is part of your organisation, then configuring it to monitor your APS appliance is definitely worth doing. I’ve lifted the image here from the APS help file – if you’re at all familiar with SCOM, you’ll recognise it and see that you have good coverage of your APS environment with it. It should never fully replace using queries to look at the metadata within (for assessing skew, etc.), but you should definitely be using SCOM with APS if you can.

I mentioned that this image is part of the APS help file – it goes into quite some detail about setting up SCOM to work with APS, so if you’re considering APS, you should be able to reassure your IT staff that they will be able to use their existing SCOM environment to monitor the appliance still.

Don’t neglect monitoring your APS box. When we get an appliance, it’s easy to let it just sit there and do its stuff, assuming that everything is going to be okay because it’s an appliance. We don’t monitor our kettles at home, but our businesses don’t depend on the health of the kettles (maybe the coffee machine, but that’s a different story). Monitoring doesn’t have to be hard work, but it does have to happen. Luckily, we get a bunch of tools to help us make that happen.

And this is for yet another T-SQL Tuesday. The first for 2015, and the 62nd in all – hosted this time by Robert Pearl (@pearlknows).

@rob_farley

14 Apr 18:28

SQL Server 2014 Upgrade Technical Guide

by Dejan Sarka

If you plan to upgrade to SQL Server 2014, then this technical guide is a must. With 429 pages, it is a complete book, yet still available as a free download here. It covers all products and features in the SQL Server suite, and upgrades from versions 2005, 2008, 2008 R2, and 2012. It supplements the information available in Books Online. Besides the actual upgrade, the white paper also covers planning and pre- and post-upgrade tasks.

This is the fourth upgrade technical guide I coauthored – all together, I was involved in versions 2008, 2008 R2, 2012, and 2014. This is the complete list of authors: Ron Talmage, Richard Waymire, James Miller, Vivek Tiwari, Ken Spencer, Paul Turley, Danilo Dominici, Dejan Sarka, Johan Åhlén, Nigel Sammy, Allan Hirt, Herbert Albert, Antonio Soto, Régis Baccaro, Milos Radivojević, Jesús Gil, Simran Jindal, Craig Utley, Larry Barnes, Pablo Ahumada. Thanks to everybody for a smooth process of writing and editing!

14 Apr 18:27

Demystifying Clustered Columnstore Indexes

by tlachev

Non-clustered columnstore indexes (NCCI) were introduced in SQL Server 2012 to improve the performance of large aggregate queries (common for data warehousing) with the caveat that there were read-only. Consequently, the ETL process has to drop NCCI, load the data, and recreate the columnstore index.

NOTE Building an index (columnstore or regular) should be a highly-parallel operation. Building a columnstore index in particular should max out all licensed cores. Currently, we have an open support case with Microsoft where a columnstore index allows adding a computed column (while it shouldn't). Consequently, SQL Server builds the index using a single thread which may lead to excessive index creation times.

SQL Server 2014 introduced clustered columnstore indexes (CCI) which offer two main advantages:

CCI is updatable -- Therefore, you don't have to drop and recreate the index anymore.
Storage is greatly reduced -- For example, you might have a fact table of 100 GB. Assuming x10 compression, your table space with NCCI will be 110 GB. However, if you create CCI the table space will be only 10 GB. This is why Analytics Platform System (previously known is PDW) uses CCI.

On the downside, CCI doesn't support any other indexes that you might need to optimize ETL or speed up joins. Basically, CCI is the only index you can have on a table. However, CCI is not designed for equality and short-range queries that are typical for detail-level SELECT or MERGE queries. It's a common misconception that you don't need such indexes with CCI but this is not the case. I hope a future release of SQL Server enhances CCI to support regular indexes as well.

14 Apr 18:27

Theory of Data…

by Karen Lopez

14 Apr 18:27

SQL Injection – Why the Heck Does this Keep Happening?!

by kevin

kkline10

I’m decidedly old-school in a few of my habits. My morning routine, barring anything out of the ordinary, is one such example. Typically, I’ll get up between 6:30 to 7:00 am, grab my first cup of coffee, and then chat with my daughters for a few minutes before seeing them off to school. Next, I make a bowl of oatmeal (a great choice for diabetics like me), pour a second cup of coffee, and browse the local paper, The Tennessean, while I have breakfast. On the morning of August 5, I had the added pleasure of spewing said coffee and oatmeal all over my morning paper when I read the headlines: Russian gang stole 1.2 billion Net passwords.

Scanning the article, midway through the details, I saw that the hackers used good, ol’ fashioned SQL injection attacks. They target approximately 450,000 websites from small mom-and-pop shops to the very largest company websites. A bit later, I read more online from The New York Times article where the story first broke.

Read the rest of the article at http://www.dbta.com/Columns/SQL-Server-Drill-Down/Russian-Hackers-Steal-12b-Internet-Credentials-Or-Why-the-Heck-Does-this-Keep-Happening!-99182.aspx

The post SQL Injection – Why the Heck Does this Keep Happening?! appeared first on Kevin Kline.

14 Apr 18:26

Did You Know: Sometimes I have to troubleshoot issues not dealing with SQL Server

by Kalen Delaney

In the last couple of weeks, I’ve had to troubleshoot several technical problems that, while not critical, were extremely frustrating, annoying and inconvenient. Two of them had to do with setting up my new computer. I am now using a Microsoft Surface Pro 3 as my main work computer. And mostly, I absolutely love it! However, there have been a couple of problems. 1. Getting the tablet into the docking station involved a whole lot of jiggling. This is the first time I’ve ever used a docking station...(read more)

14 Apr 18:26

Tip # 2 – Check your Security

by Chris Shaw

Tip # 2 – Check your Security

Top 10 Tips for SQL Server Performance and Resiliency

This article is Part 2 in a series on the Top 10 Most Common Mistakes: SQL Server Performance and Resiliency. Neglecting security is the second biggest mistake I frequently see impacting SQL Server’s performance. In case you have missed the prior posts:

#1 Most common mistake – Incomplete Backups

I can’t tell you how many times I have been told “We have a really good firewall so the internal security we have in place is not as important” or “We like to operate with a level of trust, and Chris, you need to learn how to trust people”. The second quote came from an IT Director from a public company. My response to the trust argument is that security has nothing to do with trust, security is about protecting the data from not only things that people may set out to do intentionally, but also protect the data from things people can do on accident.

I am not ashamed to admit I like to have my SQL Servers secure. I consider it not only my job, but a core responsibility in my career. If you have any doubt of how important security is today, just take a few moments and do a search on cyber-attacks. With recent headlines where major organizations are victims of cyber-attacks, hopefully this post becomes all that more important to you. Below are a few practices and procedures that can help secure your data, as well as accidental mistakes made by people without the proper knowledge of the database practices and procedures:

Not everyone should be a SA

The SA name is short for Systems Administrator. This account has permissions to everything inside SQL Server. When a user requests SA, they may be referring to a couple things. They may be asking for the password of the SA account or they may want you to assign their login to the Systems Administrator role. In my opinion, there are very few people who need this level of access to a SQL Server. Usually, these people have a job title that says, “Database Administrator”. Keep in mind: the more people that have this access level to your SQL Server, the more risk you take when it comes to how many people can make really big mistakes.

To be secure, make sure the SA account has a complex password. I recommend setting the SA password to a complex password and disabling the account to be a bit more secure. The administrator’s role should have their own accounts. Treat the SA account as hidden back door into your system and only use it when nothing else works.

Not everyone should be a DBO

In like fashion to SA access, not everyone needs DBO access to a database. DBO is short for database owner. Accounts with this access have permissions to everything internal on the database. This means they can delete users, tables, stored procedures and even data. Since SQL Server cannot make a determination between DELETE statements done on accident and on purpose, it becomes imperative that someone with DBO or SA access knows what they are doing.

I often see people treat security levels as a negotiation between two parties, almost like what you would experience when purchasing a car. The requestor asks for ‘SA’ level access and the DBA might respond by haggling them down to Data_Reader access, only to have the requestor push for something in the middle, like DBO access. As funny as this sounds, some DBA’s agree to it. The theory being that DBO is more restrictive than ‘SA’, so logically, security is intact. The end result in this hypothetical bargain is ‘DBO access’, which many people don’t need. Check the accounts and find out what is required to complete the work tasks. Don’t give employees any more room to break things than they need.

Securing your backups

The larger the organization, the more difficult public relations become after a data breach. The last thing any DBA wants to be part of is a letter to customers letting them know the data they are responsible for has been compromised.

Protect your databases, encrypt them with a password and manage the permissions on your backup locations like they contain your personal social security number. The protection of your database backups doesn’t stop there, make sure the offsite locations is secure as well.

Linked Servers

Linked servers can add tremendously to the ease of retrieving data from your SQL Servers. Often, people get pretty excited when they learn how easy it is to use and configure. However, a big mistake that I often see has to do with the security configuration of the link. When the linked server is configured, settings that should be set to protect the data on the server being linked to. I often see configurations that will allow a specific named user that may have Data_Reader on the primary server connect to the linked server using the SA account.

BUILTIN\Administrators

In older versions of SQL Server, if you had an administrator account on the windows machine such as local admin access, you would by default inherently have SA access to the SQL Server. This can be very helpful in small shops where a Windows Admin may also be a SQL Server Admin. The issue is sometimes the Windows Admin shouldn’t have that level of access to the data or configuration of SQL Server.

No Policy Management for SQL Server Accounts

SQL Server authenticated accounts, or accounts that reside only in SQL Server have the option to have the policy enforced by the Windows Policy. Sometimes these are configured this way because people don’t want to change the password for SQL Server, or have to worry about having a strong enough password. This might be related to the theory that if the server is behind a firewall it is protected enough. It is a good idea to have your passwords in SQL Server follow the same policy that your windows accounts follow.

This is just a short list of some of the common security mishaps I have encountered. There are others you should be aware of, like updating your service packs, staying on top of when people leave your company or auditing your group memberships. When you look at your database and you believe you have secured it enough, take another look. Consider creating a table called My Personal Info, and in that table insert your credit card, your debit card (don’t forget the P.I.N), your social security number and your banking information. Once that is all inserted there, how comfortable do you feel? Your data deserves to be secure and people every day are relying on you the data owner to keep it secure.

14 Apr 18:26

Grant full permission to Windows folders and files

by Haidong Ji

For a Windows application or service, it is often necessary for either the user or service startup account to have full control to its folders and files. In most cases, this is transparent to the user. It is generally being taken care of by various delegation mechanisms behind the scene. However, this can be a problem during application upgrade or migration, when one rearranges disk drive letters or mount point names, or moves things from one server to another. To resolve this, one could manually go to each file and folder property page through Windows Explorer, then the Security tab, then the Edit… button, then the Add… button, etc. This method isn’t too bad if only a handful of files and folders are involved. But it is easy to see that manual method gets old really quickly, is error prone, and does not scale.

For example, say you are doing SQL Server migration for a data warehouse instance, where there are multiple mount points and hundred of database files due to filegroup usage and table partitions. During the migration process, you’ll move them to a new location and attach those database files. To ensure SQL Server service SID has proper permission is no small task and we need an automated way.

Given this is Windows, I started searching around for a PowerShell solution. You’d think this is a common problem, and therefore it shouldn’t be too difficult to find a solution. But that’s not the case so I wrote my own. Have fun with it!

Notes:

Make sure you start PowerShell with “Start as Administrator”. This is to get around the issue that the creator and permission granter are different identities;
The first thing the script does is to give ownership to the Administrators group. During execution, you’ll see output like:
SUCCESS: The file (or folder): “x:\MountPoints\A1” now owned by the administrators group.
that is to be expected.
The script’s execution is recursive, meaning all sub-folders and files are covered.

Without further ado, here is the PowerShell function:

function grantFullPermission ($folderpath, $userAccount) {
    Invoke-Expression "takeown /f $folderpath /r /a /d Y"
    $aclWork = (Get-Item $folderPath).GetAccessControl("Access")
    $ssaf = New-Object system.Security.AccessControl.FileSystemAccessRule($userAccount, "FullControl", "ContainerInherit, ObjectInherit", "None", "Allow")
    $aclWork.SetAccessrule($ssaf)
    Set-ACL $folderPath $aclWork }

As an example, to grant proper permission to 3 folders listd in the code, you can run the following script as administrator:

function grantFullPermission ($folderpath, $userAccount) {
    Invoke-Expression "takeown /f $folderpath /r /a /d Y"
    $aclWork = (Get-Item $folderPath).GetAccessControl("Access")
    $ssaf = New-Object system.Security.AccessControl.FileSystemAccessRule($userAccount, "FullControl", "ContainerInherit, ObjectInherit", "None", "Allow")
    $aclWork.SetAccessrule($ssaf)
    Set-ACL $folderPath $aclWork }

$folders = @("x:\MountPoints\A1", "x:\MountPoints\A2\", "x:\MountPoints\B1")

$folders | foreach {
    grantFullPermission $_ 'NT SERVICE\MSSQLSERVER' }

You could also make this a short program that takes two parameters: a folder and an account. Remember to quote the parameters if they have space character.

param(
[parameter(Mandatory=$true)]$DirectoryPath,
[parameter(Mandatory=$true)]$AccountName)

function grantFullPermission ($folderpath, $userAccount) {
    Invoke-Expression "takeown /f $folderpath /r /a /d Y"
    $aclWork = (Get-Item $folderPath).GetAccessControl("Access")
    $ssaf = New-Object system.Security.AccessControl.FileSystemAccessRule($userAccount, "FullControl", "ContainerInherit, ObjectInherit", "None", "Allow")
    $aclWork.SetAccessrule($ssaf)
    Set-ACL $folderPath $aclWork }

grantFullPermission $DirectoryPath $AccountName

21 Jan 20:33

Stalled for Time

by Ellis Morning

Drew worked for a petite e-commerce company, where his primary responsibility was to keep everything running. He saw to day-to-day maintenance, as well as backup and disaster recovery.

The small shop’s resources were limited. Critical data remained tied up in onsite legacy systems- and of course, the “supply closet” was full of relics from a clunkier, floppier era. Drew did the best he could to shield the company from server outages, power outages, backup failures, and other common scenarios.

Then one day came the emergency neither Drew nor anyone else had ever anticipated…

A “resolved” dispute with a vendor from the previous year had abruptly evolved into “Pay us £15k immediately.” Drew’s company had agreed to pay that coming Friday, when they’d actually have the funds. It was only Wednesday. Nevertheless, bailiffs had pulled up outside of the office suite where the company was based, with orders to confiscate “all computer-related items.”

Drew’s boss called him from reception, and quickly explained the issue. “Hide the computers!” he finished with a hiss.

There were 7 workstations within the suite. Drew’s pulse began to race. “Where? The offices and closets are crammed full. I can’t take them out the front or the back without being spotted.”

“I don’t care- get creative!” his boss replied. “Those machines house all our email, code, and project deliverables. We’re dead in the water without them! I’ll stall them as long as I can- now go!” Click.

Drew had only moments to think, but desperation hatched an ingenious emergency backup strategy.

A bathroom stall on the 3rd floor became the business’ new center of operations; it was the only bathroom Drew had keys for. Reluctant to use the elevator for fear of running into the bailiffs, he lugged the desktops up the stairwell by hand, one or two at a time, piling them around a sensor-rigged toilet that awarded each deposit with a loud flush.

Once all the machines were secure, Drew taped an “out of order” sign to the stall door, and used a pair of scissors to lock the door from the outside.

Sweaty and out of breath, he stumbled back down to the office- but his work was far from over. If the bailiffs saw desks adorned with monitors and keyboards and mice attached to nothing, they’d surely get suspicious.

Again, hurried inspiration provided the answer: the supply closet.

Within minutes, the office fell into an alternate reality where desktop technology had peaked in 1985. Drew was about to swap out the modern monitors for CRTs when the bailiffs finally barged in with a police escort- Drew’s red-faced, shouting boss on their heels.

Drew hurried back to his own desk and held his breath.

“What is this rubbish?” one bailiff wondered aloud.

“It’s what we have,” Drew’s boss returned defensively.

The bailiffs laid claim to everything in sight- including Drew’s own personal USB hubs.

“Hey, those are mine!” he cried.

"They’re connected to the computer, so they fall under the remit of items to collect,” the bailiff explained.

“I bought these with my own money. They don’t belong to the company. I’ll be damned if you take them!” Drew accidentally brandished the scissors he’d used to lock up the bathroom stall.

“Put those down, sir,” a nearby police officer requested. “They’re making me nervous.”

Drew and the other lower-level minions were sent to wait things out at the pub, because the UK is rather brilliant like that. In the end, the bailiffs decided hauling off a few hundred quid worth of junk wasn’t worth the trip, and wound up seizing nothing.

A great deal of celebration ensued, culminating in the boss agreeing to reward Drew with shares in the company. Afterward, Drew went back to the office to perform disaster recovery, so everyone could start work the next day as normal.

An office worker from a different suite was present in the mens’ room. Drew tore down the “out of order” sign, forced open the lock with scissors, and entered the stall. One noisy automatic flush later, he walked out with one of the desktops propped on his shoulder.

"Don’t ask,” Drew told the other gentleman.

"I…don’t want to know,” he replied.

[Advertisement] Manage IT infrastructure as code across all environments with Puppet. Puppet Enterprise now offers more control and insight, with role-based access control, activity logging and all-new Puppet Apps. Start your free trial today!

Hydraulic.druid, Sam Dineley likes this

21 Jan 20:27

CodeSOD: Today is Special

by TJ Mott

march-2011-whatever Shawn's company needed a critical piece of software developed ASAP. The development team estimated six weeks of time to build it from scratch, but those in charge felt that was far too excessive. The devs had a track record of missing features--"Management NEVER told us we needed that feature!", botched rollouts--"What?! You told $newClient their site would go live on Monday? The product won't be ready until August and it's only February!", and going over-budget--"We developed it for SQL Server Express, I don't know why $phb bought us an enterprise Oracle instance."

Against the objections of everyone who did the actual work within the company, management partnered with the local University to have graduate students, most of whom had zero experience outside of academia, develop the new system for internship credit. Unpaid, of course. The plan then was that Shawn would spend "an hour or two" to polish the code up a bit and get it going in production.

After a full two semesters of development, Shawn finally received the code from the intern team and began preparing it for production. While attempting to get index.php to load without blowing up, he found this representative line that perfectly describes the overall code quality:

$closeDate = date( 'Y-m-d', strtotime(date("Y-m-d", strtotime($currentDate)) ));

After a week of analysis in which Shawn, unsurprisingly, found horrible coding standards, SQL injection vulnerabilities with every page, and features which were only vaguely similar to those outlined in the product requirements, he sent his estimate up the chain of command. It would take four to six weeks to fix the student-quality code enough to be production-worthy and deploy it.

The next day, he learned in a meeting that the company wouldn't afford the time to fix it. Instead, they would partner with the local University and hire a new team of unpaid interns to fix the system up.

[Advertisement] Use NuGet or npm? Check out ProGet, the easy-to-use package repository that lets you host and manage your own personal or enterprise-wide NuGet feeds and npm repositories. It's got an impressively-featured free edition, too!

21 Jan 20:24

Error'd: For Those Who Insist on Zapping Their Tongues

by Mark Bowytz

"Cool! Amazon now apparently offers a battery that comes in various flavors to enhance the experience," writes Evan C.

Niels R. wrote, "Finally. A website for when you absolutely, positively need a mouse that rates 2.1428571428 stars out of 5."

"With many systems that get butchered over the days/weeks/months/years certain columns, once used solidly for single responsibility, get hacked and overtaken by sales people. So instead of just having an error message, they started to report on a confirmation status because the Status of OK wasn't enough. A Bit like when people press a button that is already lit. (You never know it really might be broken)," writes Simon M., "Anyway, I found this gem of a status update on one of the schedules."

"I sense quite a bit of irony in this error," wrote TJ S.

"I was looking at the specs of a DVI extender when I noticed that the operating humidity of the device was 'Mozilla 1.4 or later',"Peter W. writes, "I'm not really sure that the air conditioner in our server room can provide that."

"Just saw this latest build result on my CI server. Apparently the test results come and go as they please," Tom C. wrote.

"I followed an ad link in http://www.smithsonianmag.com/ with the promising headline 'Non-existent Code Doesn't Crash'," Vince writes, "Unfortunately I ended up with some real code which (though it didn't actually crash if I'm being fair) certainly didn't work."

Aaron wrote, "Woo hoo! Check it out. My phone has a direct link to the CLOUD, baby!,"

[Advertisement] BuildMaster is more than just an automation tool: it brings together the people, process, and practices that allow teams to deliver software rapidly, reliably, and responsibly. And it's incredibly easy to get started; download now and use the built-in tutorials and wizards to get your builds and/or deploys automated!

21 Jan 20:23

The Backend

by Xander R.

Andrei had just moved to a new country, and was looking for work. With the ideal job not immediately forthcoming, he jumped when an interesting opportunity opened at a small tech company: OldIsNewCo.

OldIsNewCo was one of the pioneers bridging the gap between old methods of communication (e.g. paper) and new methods of communication (i.e. the Internet), and was looking to overhaul their entire infrastructure. Their existing backend was built in C++, and according to The Big Boss: “C++ developers are expensive and hard to find. PHP developers are plentiful and cheap. Therefore, we need to re-implement everything in PHP.”
PHP/CPP icon mangled from the Crystal Icon Theme
This seemed fishy to Andrei right from the start, but he figured it would be a good opportunity to brush up on his C++ and PHP skills and experience, and “a job’s a job”. So he accepted the position and met with his development tour guide to start diving into the sources and seeing how the backend applications worked.

Or, he tried - only to quickly discover Problem #1: There wasn’t a complete version of the source code in any single location. The sources for the backend were spread out over different production machines (i.e. the machines running the compiled code). Because these machines had variations, building was done on each machine locally, since the code might not build on a different machine with different software or configuration. This contributed to Problem #2: Making modifications to some of the backend was done by editing code on the same machine where that component ran.

As Andrei wrapped his head around those concerns, he began to wonder how the developers tested their changes. His development tour guide responded: “Oh, we don’t have a testing environment. We just deploy the new version to the live servers.” Deploying to production consisted of: waiting until the night when the system was mostly free of clients, compiling the code on the same machine it would then run, and hoping for the best. If the new build crashed the next morning when users returned, they would just revert to the old build and repeat the attempt the next evening.

But project organization, testing and deployment were far from the only concerns. As he finally began to dig into the code and architecture itself, Andrei discovered that the existing C++ backend had a few problems of its own:

The core backend app simply operated on files - it read them in, processed the contents, and output data to a database and other machines via the network. But this was implemented in the most complicated way possible, with numerous multithreading issues that resulted in the deployed application crashing frequently
It crashed so frequently, in fact, that it was one employee’s main job to watch the servers, wait for the app to crash, and manually restart it. 
All database synchronization was done in the application’s own C++ code, instead of using the built-in synchronization tools that were part of the database engine (i.e. no database transactions or locks were used). Combined with poor database design, this part of the system would also overload and deadlock quickly. As such, it was another employee’s main job to monitor for this, kill all the processes, clean up the databases by hand (because they’d likely have incomplete or corrupt data), and manually restart everything. 
Access to the database itself had enough pieces hard-coded in the C++ code that each time anything database-related was modified the C++ code had to be searched and modified extensively too. And if everything that a change affected wasn’t updated, everything would crash the next day when the untested code was deployed to production. 
The database was supposed to be replicated, but actually was not. This was discovered when the main database server crashed and the replicated database responded with errors and missing data. 
To meet the “new requirements” of The Big Boss, many of the latest features were allegedly implemented in PHP code but were actually coded as a series of thousands of system calls from PHP. (Apparently no one knew how to perform file management or monitoring tasks from either C++ or PHP code.) 
The existing code-base was written using ancient libraries running on an already-set-to-be-imminently-discontinued version of a Linux distribution. So the servers couldn’t be upgraded without extensive rewrites anyway.

Shocked by the state of everything, Andrei decided to kick off his rewrite project on the right track, and work up a proposal to take a reasonable number of days to collect and centralize the sources, create development / test / deployment environments and scripts, and otherwise organize the existing project. But his development tour guide quickly took him aside:

“I made the same exact proposal, and it wasn’t just denied - it’s the reason I’m being let go in two weeks. According to the boss, I proposed: ‘doing nothing for 2 weeks’.”

It was then that Andrei learned the underlying reason for a lot of the problems: The Big Boss deemed useful only those developers who could meet his demands immediately. Any other proposed steps or actions were seen as potentially fireable “wastes of time”. Fellow employees were so afraid of The Big Boss that they gave up on all “time wasting” tasks, including organizational upkeep, sensible deployment or testing practices, or even sometimes coding things the proper way (because the deadline was whatever The Big Boss deemed doable).

So Andrei sighed and kept his head down. Piece by piece, he ripped out chunks of the old C++ code and replaced them with lightweight modules written in PHP that were considerably faster, simpler, smaller, and actually stable. When he had extra time within the deadlines assigned to him, he even set up his own private development, testing, and source code versioning environment to make his task easier (but he couldn’t tell anyone else - let alone The Big Boss). Meanwhile, every 2–3 weeks a developer or new-hire who couldn’t implement the boss’s whims quickly enough was fired, especially those who dared to propose spending time on testing or documentation or other “wastes of time”.

When Andrei finished the rewrite a month early, the backend system’s codebase was several times smaller, ran several times faster, and required 1/4 of the servers to run without crashing. The Big Boss was pleased and asked him to “take a quick look at the databases - you know, tune them up, make them run a little faster”, but Andrei had already found another job, and quit OldIsNewCo - eager to leave the memories of their Horrible Software in his past. After all, with a new job on the line, staying would have just been a “waste of time”.

Image source: 1, 2

21 Jan 19:13

Andreessen Horowitz Invests in Stack Exchange

by Joel Spolsky

(Note: This is a cross post from Joel on Software).

Stack Exchange Raises $40m

Today Stack Exchange is pleased to announce that we have raised $40 million, mostly from Andreessen Horowitz.

Everybody wants to know what we’re going to do with all that money. First of all, of course we’re going to gold-plate the Aeron chairs in the office. Then we’re going to upgrade the game room, and we’re already sending lox platters to our highest-rep users.

But I’ll get into that in a minute. First, let me catch everyone up on what’s happening at Stack Exchange.

In 2008, Jeff Atwood and I set out to fix a problem for programmers. At the time, getting answers to programming questions online was super annoying. The answers that we needed were hidden behind paywalls, or buried in thousands of pages of stale forums.

So we set out to build Stack Overflow with a single-minded, compulsive, fanatical obsession with serving programmers by building a better Q&A site.

Everything about how Stack Overflow works today was designed to make programmers’ jobs easier. We let members vote up answers, so we can show you the best answer first. We don’t allow opinionated questions, because they descend into flame wars that don’t help people who need an answer right now. We have scrupulously avoided any commercialization of our editorial content, because we want to have a site that programmers can trust.

Heck, we don’t even allow animated ads, even though they are totally standard on every other site on the Internet, because it would be disrespectful to programmers to strain their delicate eyes with a dancing monkey, and we can’t serve them 100% if we are distracting them with a monkey. That would only be serving them 98%. And we’re OBSESSED, so 98% is like, we might as well close this all down and go drive taxis in Las Vegas.

Anyway, it worked! Entirely thanks to you. An insane number of developers stepped up to pass on their knowledge and help others. Stack Overflow quickly grew into the largest, most trusted repository of programming knowledge in the world.

Quickly, Jeff and I discovered that serving programmers required more than just code-related questions, so we built Server Fault and Super User. And when that still didn’t satisfy your needs, we set up Stack Exchange so the community could create sites on new topics. Now when a programmer has to set up a server, or a PC, or a database, or Ubuntu, or an iPhone, they have a place to go to ask those questions that are full of the people who can actually help them do it.

But you know how programmers are. They “have babies.” Or “take pictures of babies.” So our users started building Stack Exchange sites on unrelated topics, like parenting and photography, because the programmers we were serving expected—nay, demanded!—a place as awesome as Stack Overflow to ask about baby feeding schedules and f-stops and whatnot.

And we did such a good job of serving programmers that a few smart non-programmers looked at us and said, “Behold! I want that!” and we thought, hey! What works for developers should work for a lot of other people, too, as long as they’re willing to think like developers, which is the best way to think. So, we decided that anybody who wants to get with the program is welcome to join in our plan. And these sites serve their own communities of, you know, bicycle mechanics, or what have you, and make the world safer for the Programmer Way Of Thinking and thus serve programmers by serving bicycle mechanics.

In the five years since then, our users have built 133 communities. Stack Overflow is still the biggest. It reminds me of those medieval maps of the ancient world. The kind that shows a big bustling city (Jerusalem) smack dab in the middle, with a few smaller settlements around the periphery. (Please imagine Gregorian chamber music).

View of Jerusalem

Stack Overflow is the big city in the middle. Because the programmer-city worked so well, people wanted to ask questions about other subjects, so we let them build other Q&A villages in the catchment area of the programmer-city. Some of these Q&A villages became cities of their own. The math cities barely even have any programmers and they speak their own weird language. They are math-Jerusalem. They make us very proud. Even though they don’t directly serve programmers, we love them and they bring a little tear to our eyes, like the other little villages, and they’re certainly making the Internet—and the world—better, so we’re devoted to them.

One of these days some of those villages will be big cities, so we’re committed to keeping them clean, and pulling the weeds, and helping them grow.

But let’s go back to programmer Jerusalem, which—as you might expect—is full of devs milling about, building the ENTIRE FUTURE of the HUMAN RACE, because, after all, software is eating the world and writing software is just writing a script for how the future will play out.

So given the importance of software and programmers, you might think they all had wonderful, satisfying jobs that they love.

But sadly, we saw that was not universal. Programmers often have crappy jobs, and their bosses often poke them with sharp sticks. They are underpaid, and they aren’t learning things, and they are sometimes overqualified, and sometimes underqualified. So we decided we could actually make all the programmers happier if we could move them into better jobs.

That’s why we built Stack Overflow Careers. This was the first site that was built for developers, not recruiters. We banned the scourge of contingency recruiters (even if they have big bank accounts and are just LINING UP at the Zion Gate trying to get into our city to feed on programmer meat, but, to hell with them). We are SERVING PROGRAMMERS, not spammers. Bye Felicia.

Which brings us to 2015.

The sites are still growing like crazy. By our measurements, the Stack Exchange network is already in the top 50 of all US websites, ranked by number of unique visitors, with traffic still growing at 25% annually. The company itself has passed 200 employees worldwide, with big plush offices in Denver, New York, and London, and dozens of amazing people who work from the comfort of their own homes. (By the way, if 200 people seems like a lot, keep in mind that more than half of them are working on Stack Overflow Careers).

We could just slow down our insane hiring pace and get profitable right now, but it would mean foregoing some of the investments that let us help more developers. To be honest, we literally can’t keep up with the features we want to build for our users. The code is not done yet—we’re dedicating a lot of resources to the core Q&A engine. This year we’ll work on improving the experience for both new users and highly experienced users.

And let’s not forget Stack Overflow Careers. I believe it is, bar-none, the single best job board for developer candidates, which should automatically make it the best place for employers to find developer talent. There’s a LOT more to be done to serve developers here and we’re just getting warmed up.

So that’s why we took this new investment of $40m.

We’re ecstatic to have Andreessen Horowitz on board. The partners there believe in our idea of programmers taking over (it was Marc Andreessen who coined the phrase “Software is eating the world”). Chris Dixon has been a personal investor in the company since the beginning and has always known we’d be the obvious winner in the Q&A category, and will be joining our board of directors as an observer.

This is not the first time we’ve raised money; we’re proud to have previously taken investments from Union Square Ventures, Index Ventures, Spark Capital, and Bezos Expeditions. We only take outside money when we are 100% confident that the investors share our philosophy completely and after our lawyers have done a ruthless (sorry, investors) job of maintaining control so that it is literally impossible for anyone to mess up our vision of fanatically serving the people who use our site, and continuing to make the Internet a better place to get expert answers to your questions.

For those of you who have been with us since the early days of Our Incredible Journey, thank you. For those of you who are new, welcome. And if you want to learn more, check out our hott new “about” page. Or ask!

06 Jan 05:32

Why Aren't We Using SSH For Everything?

by Soulskill

An anonymous reader writes: A post at Medium asks why, in this age of surveillance and privacy-related bogeymen, we aren't making greater use of SSH for our secure computing needs? "SSH is one of the most accessible secure protocols ever, second only to HTTPS of course. Let's see what we have so far: Binary protocol, mandatory encryption, key pinning, multiplexing, compression (yes, it does that too). Aren't these the key features for why we invented HTTP/2? Admittedly, SSH is missing some pieces. It's lacking a notion of virtual hosts, or being able to serve different endpoints on different hostnames from a single IP address. On the other hand, SSH does have several cool features over HTTP/2 though, like built-in client authentication which removes the need for registration and remembering extra passwords."

If the Programmer Won't Go To Silicon Valley, Should SV Go To the Programmer?

by Soulskill

theodp writes: "If 95% of great programmers aren't in the U.S.," Matt Mullenweg advises in How Paul Graham Is Wrong (a rejoinder to Graham's Let the Other 95% of Great Programmers In), "and an even higher percentage not in the Bay Area, set up your company to take advantage of that fact as a strength, not a weakness. Use WordPress and P2, use Slack, use G+ Hangouts, use Skype, use any of the amazing technology that allows us to collaborate as effectively online as previous generations of company did offline. Let people live someplace remarkable instead of paying $2,800 a month for a mediocre one bedroom rental in San Francisco. Or don't, and let companies like Automattic and Github hire the best and brightest and let them live and work wherever they like." Microsoft and Google — which hawk the very tools to facilitate remote work that Mullenweg cites — have shuttered remote offices filled with top talent even as they cry the talent sky is falling. So, is "being stubborn on keeping a company culture that requires people to be physically co-located," as Mullenweg puts it, a big part of tech's 'talent shortage' problem?" Chris Pepper also recently posted another reasoned rebuttal to Graham's post.

Inside Amsterdam's Efforts To Become a Smart City

by samzenpus

First time accepted submitter MollsEisley writes Throughout Amsterdam there are "Living Labs," or communities that act as petri dishes for ideas and initiatives to be tested before scaling them across the city. In IJburg, Amsterdam's youngest neighborhood, projects like free Wi-Fi and a new Fiber network, personalized television and transportation services, and a coworking space allow residents to experiment and test city projects to improve healthcare, environment, and energy programs in the city.

Testing

by andyleonard

It’s 2015 and we still have no flying cars. (Dear Doc Brown, we still need roads. Love, Andy) On a similar note, we’ve suffered through another year of software breaches and empty promises to fix the issues behind them. Brian Kelley ( Blog | @kbriankelley ) was right, We Don’t Care About Data and IT Security . Part of the issue is testing. Do you test for security? Estimation I am often asked, “Andy, how long will it take to test this solution?” My response is, “About as long as it takes to develop...(read more)

06 Jan 05:21

vRealize Operations 6.0 - New Merged UI

by Jason Gaudreau

vRealize Operations 6.0 has had a major overhaul, one of the biggest changes is the feel and look of the new product UI; which incorporates the previous vSphere UI and Custom UI into a single console. With this in mind, the new vRealize 6.0 UI is split up into different panels. If you look at the image below, the panel on the left is called the Navigation panel. We have the Center panel, which is where all the data is displayed in the middle of the portal. And finally, at the top, we have all our Dashboards.

At the top of the Navigation Panel, you will recognize five buttons that are links to parts of the product UI. vRealize Operations 6.0 is broken up into five sections. They include the Home link, Alert link, Environments link, Content link, and the Administration link.

We start off in the Home screen, the screen that is displayed when you initially log into the product UI. The Home screen is where your dashboards reside, including any custom dashboards you may create and any custom product solutions you may install (vCenter solution, Hyperic Solution, SCOM Solution). These are the same type of dashboards that we were used to seeing in the vCenter Operations Manager Custom UI at https://(vCenter Operations Manager host)/vcops-custom/; now integrated into a single viewing panel on the Home screen of vRealize Operations 6.0.

In the below diagram, I have clicked on the vSphere VMs CPU dashboard, which gives me the heatmap widgets we recognize from vCenter Operations Manager 5.x. It displays VMs sized by CPU demand % and colored by CPU content %.

You can click on the Dashboard List drop down field, and select a specific dashboard to drill down into specific issues.

The Actions drop down list provides the ability to create new dashboards, edit existing dashboards, delete dashboards, remove dashboards from the menu, and set a dashboard as the default.

Now that we have looked at the Home page, let's look at what I consider the most fundamental facet of vRealize Operations 6.0, the Environment view. As the shepherds of the virtualization infrastructure, we perform the majority of our work monitoring the health and stability of our infrastructure environment and keeping the wolves at bay.

When you click on the Environment link, it is going to bring you to the portal that displays three different inventory lists in the Central panel. These inventory lists include Group, Application, and Inventory.

You can customize the Groups and Applications tab, but the Inventory tab is a static list of all the objects being monitored by vRealize Operations 6.0. Groups and Applications were available in vCenter Operations 5.x. Groups were found in the vSphere UI and Applications were found under the Custom UI. Now both features have been collapsed into the Environment Overview in vRealize 6.0.

I am going to create a custom Application inventory list for a dashboard that I will create in a future post. Click on the green + sign at the top of the Applications tab. Next, I am going to select Custom for the Add Application type, and then click OK.

If you have created applications in the past, this should look very familiar to you.

On the Application Management page, I am going to give my application the name of Management Application Group. I am going to create a new Tier and call it the Management Tier. Under the object filter tree, I am going to select Object Type and then Virtual Machines. I will then drag the virtual machines I want included in my new application group from the bottom right hand panel to the Tier Objects pane.

After I click Save, the new application group shows up under the Applications tab inventory list.

Application groups provide you the ability to monitor specific application resources, it gives you a more concise view the health, risk, and efficiency of the specific application tier. These types of application inventory groups are vital when making custom dashboards for troubleshooting application issues, or providing a dashboard to your application developers so they can monitor the health and stability of the infrastructure platform supporting the business applications.

To finish off the post, we are going to look at the Navigation pane of the Environment portal. The Navigation pane is broken up into inventory trees. There are tree types (storage, hosts and clusters, networking, ect), tree instances (each vCenter would create an instance of the hosts and clusters tree), branches (relationships, and leafs (objects).

Each adapter can create new inventory trees when it is installed, for example if I install the EMC Storage Analytics MGMT pack it can make relationships to hosts and datastores that already exist. An important fact to remember, it is important to install the adapter packs that you have in vCenter Operations Manager 5.8 before you migrate to vRealize Operations 6.0. vRealize 6.0 will not know what to do with the data until you have established the necessary relationships.

If I click on a host, then I get information on the Center panel.

In the picture below, you will notice eight different tabs.

Summary tab: Displays active alerts for the object and for any descendent objects.
Alerts tab: Displays the active alerts for the current object.
Analysis tab: You use the Analysis tabs to evaluate the details about the state of your objects to help you resolve problems.
Troubleshooting tab: Helps to identify the root cause of problems that are not resolved by alert recommendations or simple analysis.
Details tab: This view provides multiple ways to look at different types of collected data by using trends, lists, distributions, and summaries.
Environment tab: The Environment tab shows how objects in your environment are related.
Projects tab: With projects, you can create scenarios to add virtual machines and hosts to the cluster and view the impact it will have on your environment to determine the best course of action.
Reports tab: Several out-of-the-box reports with the ability to generate fully customizable reports.

That is a brief overview of vRealize Operations 6.0's new merged UI, there have been several changes to help make it more uniform by merging in the vCenter Operations Manager 5.8 vSphere UI and Custom UI. It takes a few days to get use to the new layout, but it is much easier to use than having separate UI interfaces like the previous version.

06 Jan 05:20

Updated sys.dm_os_waiting_tasks script

Over the holidays I was playing around with parallelism and updated my sys.dm_os_waiting_tasks script to add in the scheduler distribution of the waiting threads. Here it is for your use.

Enjoy!

(Note that ‘text’ on one line does not have delimiters because that messes up the code formatting plugin):

/*============================================================================
  File:     WaitingTasks.sql

  Summary:  Snapshot of waiting tasks

  SQL Server Versions: 2005 onwards
------------------------------------------------------------------------------
  Written by Paul S. Randal, SQLskills.com

  (c) 2015, SQLskills.com. All rights reserved.

  For more scripts and sample code, check out 

http://www.SQLskills.com

  You may alter this code for your own *non-commercial* purposes. You may
  republish altered code as long as you include this copyright and give due
  credit, but you must obtain prior permission before blogging this code.
  
  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF 
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED 
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/
SELECT
    [owt].[session_id],
    [owt].[exec_context_id],
    [ot].[scheduler_id],
    [owt].[wait_duration_ms],
    [owt].[wait_type],
    [owt].[blocking_session_id],
    [owt].[resource_description],
    CASE [owt].[wait_type]
        WHEN N'CXPACKET' THEN
            RIGHT ([owt].[resource_description],
                CHARINDEX (N'=', REVERSE ([owt].[resource_description])) - 1)
        ELSE NULL
    END AS [Node ID],
    --[es].[program_name],
    [est].text,
    [er].[database_id],
    [eqp].[query_plan],
    [er].[cpu_time]
FROM sys.dm_os_waiting_tasks [owt]
INNER JOIN sys.dm_os_tasks [ot] ON
    [owt].[waiting_task_address] = [ot].[task_address]
INNER JOIN sys.dm_exec_sessions [es] ON
    [owt].[session_id] = [es].[session_id]
INNER JOIN sys.dm_exec_requests [er] ON
    [es].[session_id] = [er].[session_id]
OUTER APPLY sys.dm_exec_sql_text ([er].[sql_handle]) [est]
OUTER APPLY sys.dm_exec_query_plan ([er].[plan_handle]) [eqp]
WHERE
    [es].[is_user_process] = 1
ORDER BY
    [owt].[session_id],
    [owt].[exec_context_id];
GO

The post Updated sys.dm_os_waiting_tasks script appeared first on Paul S. Randal.

Mrdenny

Shared posts

Configuration

Known Limitation

SQL JDBC Driver Versioning and Files

Setup

Paging Scenarios

An alternate approach

Testing the clustering key sort

Testing the phone book sort

Testing the user-defined sort

Higher volume

Conclusion

Stack Exchange Raises $40m