Shared posts

02 Aug 01:01

In-Memory OLTP Q & A: Myths and Realities

by SQL Server Team

Since SQL Server In-Memory OLTP (formerly known as project code-named “Hekaton”) was unveiled at PASS last November, there has been a lot of conversation and buzz about it. Some of the discussion was well informed and accurate, and some… not so much. With this series, and with online documentation at http://go.microsoft.com/fwlink/?LinkId=302261, we will be giving you much more full and complete information than we’ve been able to previously. In the meantime, we’ve heard lots of the rumors, myths, and misperceptions going around out there. In this post, I’ll identify some of the more common ones, and discuss the realities behind the myths.

Myth #1: In-Memory OLTP is a recent response to competitors’ In-memory offerings

Reality

Project code-named “Hekaton” was started around 5 years ago in response to business and hardware trends. We started an incubation project in partnership with Microsoft Research to imagine what a database engine, designed from scratch for today’s hardware realities, would look like. The In-Memory OLTP feature is the outcome of that incubation.

Myth #2:

In-Memory OLTP is the same as the old SQL 7.0 feature DBCC PINTABLE; allow pinning buffer pool pages or tables in Memory.

Reality

In-Memory OLTP uses a completely new design built from the ground up to optimize for efficient In-Memory data operations. Data in memory-optimized tables is not organized in pages, and does not use the Buffer Pool. By dispensing with data structures and other infrastructure which is there to facilitate paging subsets of data between disk and memory, we can get to a much leaner and more efficient data engine while still retaining the essential characteristics of the data engine.

Myth #3:

In-Memory databases are new separate products

Reality

Unlike most or all of the other in-memory database products on the market, In-Memory OLTP is a feature fully integrated into SQL Server 2014. This means that no separate installation is needed. There is no need to learn different tools. Most importantly, it allows an incremental investment strategy, where you selectively move tables to the most appropriate storage for the data represented by each table. In addition because it is built into core SQL Server you can leverage other SQL Server functionality in addition to In-Memory OLTP.

Myth #4:

You can run an app on In-Memory OLTP with NO changes whatsoever

Reality

There are at least some changes, at minimum changing the schema to be In-Memory ready. While we are working towards this ideal, in reality the application will typically need some code modification. In many cases, these changes can be isolated in Stored Procedures used to access the database, and the actual application need not change. In future releases of SQL Server, we plan to increase our compatibility, and thus reduce the need for modifications, but there will always be a need to at least change the schema to designate the memory optimized tables.

Myth #5:

All applications will get 100X Improvement with In-Memory OLTP

Reality

Results will vary depending on the application and the level of effort chosen and applied. Many applications see very significant performance gains, on average 10X, with results up to 30X. Much like optimizing hardware configurations, the results you get vary significantly depending on how much effort you put into the project. You can do a very simplistic, and low cost project, defining hot tables as memory optimized, and doing no other changes, or you can rewrite stored procedures to be natively compiled, investing more effort, and resulting in much better results. It should also be noted that not all applications are appropriate for In-Memory OLTP technology. In some cases, other features such as columnstore will be more successful.

Want to learn even more about In-Memory OLTP? Download SQL Server 2014 CTP1 and get started today!

02 Aug 01:00

What’s New for Columnstore Indexes in SQL Server 2014

by SQL Server Team

In SQL Server 2012, we introduced the memory optimized columnstore index. This resulted in huge query performance improvements but columnstore indexes had one significant disadvantage – they couldn’t be updated directly but had to be rebuilt on a base table. In SQL Server 2012, anyone wanting to take advantage of a this new index and also wanting to update the data behind it had to either drop the index and update the data, do some partition switching or unioning queries to the columnstore index and a regular row store containing the changing data. These workarounds are no longer required in SQL Server 2014 as we have enhanced the columnstore to be a pure columnar store, so indexing is no longer required..

In this next version, we’re introducing enhancements to the In-Memory ColumnStore for data warehousing implemented as a clustered columnstore index (or CCI) on a table. The data in a CCI is grouped and stored for every column in the table. Unlike the columnstore index, the CCI is the data – there is no other underlying data structure.

(Short aside - this might be a little confusing especially to those that haven’t lived and breathed SQL Server for the last little while. SQL has had two traditional table structures - heaps and clustered tables. Heaps, like the name suggests, aren’t ordered by anything. Clustered tables are ordered by the clustered index – in fact, the clustered index is the table. So CCI is analogous to a clustered index in that it represents the table as well.)

Unlike a columnstore index, CCIs are updateable just like any other table. (The internal details of what happens under the covers when updating a CCI are interesting and we’ll get to that in a separate post.) Now that the DML problem associated with columnstore indexes have gone away with CCIs, we can now make apples to apples comparison between columnstore and row store.

Organizing the table by column provides substantial advantages. The first one is improved query performance. When data is organized and compressed by individual columns, scan operations are significantly faster. We typically expect data warehousing type queries (ie, queries with aggregates involving large scans) to be about 10x faster (no kidding!). I’ve seen some almost unbelievable performance numbers much better than that. So if you’re developing or managing a data warehousing application, you owe yourself a favor to check this out solely for the query performance.

Second thing to expect with CCI is about a 2x improvement in data compression compared to row store page compression. Mileage may vary based on variability of values in each column, but results in this regard have been pretty consistent in my experience.

There’s also a new compression format “COLUMNSTORE_ARCHIVE” that’s available for columnstore indexes only. With the new archival compression, we can further compress the data which results in even less disk space being used. Internal tests have shown compression rates roughly double with the combination of a columnstore index and archival compression.

In our next blog entries, we’ll go into more details and show examples on how the new in-memory columnstore functionality works.

SQL Server 2014 CTP1 is now available for download here.

02 Aug 01:00

Should IT professionals learn to type? – Investing in yourself

by Greg Low

Why learn?

I was listening today to one of Scott Hanselman’s awesome podcasts with Miguel de Icaza and during the show they discussed things that you really should invest in. One list that came up was a good bed, a good chair, and to learn to type. Most of us spend 1/3 of the day in a bed and 1/3 of the day in a chair, so they seem like no-brainers. Typing is an interesting addition to that list.

Back when I was in school, in those sexist old days, typing wasn’t something that the boys learned. It was something that was taught to the girls, and mostly to those that weren’t heading in an academic direction. Up until about 1990 I was a pretty fast two-finger typist (probably more like a four-finger typist). I could cut code pretty quickly and I wasn’t too concerned about it. Occasionally though, I kept feeling that I could do much better if I learned to type. Eventually I convinced myself that if a sizeable percentage of the population could do it, surely I could learn.

Is it hard to learn?

At that time, some pretty decent typing programs for the PC had started to appear. I decided to try to learn to type and chose TypeQuick. It’s interesting (and good for them) to see that they still exist today. So many applications from those days have long since gone. I particularly liked the way that it constantly pushed you just a little bit faster than what you were achieving. I haven’t tried the program since then but I hope it’s just improved over time. But regardless, there are many of these types of programs around.

But then the real problems began. Most of these types of programs assume that you can spend an hour a day learning to type, and in a few weeks you’ll be able to do so. The implicit assumption though is that’s the only time that you’ll be typing during that time. The hassle with this is that if you type for a living (by cutting code or interacting with IT systems), spending an hour a day doing it the right way, then the rest of the day doing it the wrong way isn’t a formula for success.

I decided that I needed to break that impasse. Early one fine Sunday morning, I started doing the course and I kept going until I had pretty much finished the course, at around midnight. By then I could barely move my hands at all but I had learned all the basics.

What I hadn’t figured on though was that the hardest part was to come. During the next four weeks or so, I had tight deadlines that I knew I could meet by typing my old way, and yet I had to force myself to type correctly, even though it was at a fraction of the speed that I could do the wrong way.

I did force myself to put up with the stress and after about four weeks I was back to about the same typing speed. The brilliant part is that I was then able to keep getting faster and faster. I’d probably now average about three or four times faster than what I could do in my heyday of four-finger typing.

What difference does it make?

Clearly if you are writing a bunch of code, you can do so much more productively but there are also many subtle improvements in what you achieve, not just how quickly:

  • You are much more likely to use clear (but often longer) names for objects
  • You are much more likely to write documentation while coding (Miguel noted that people who can’t type tend to always put off tasks like documentation and I have to say that I’ve seen the same)
  • You are much more prepared to throw away code and replace it with better quality code (I’ve noticed that people who don’t type are much more likely to cling onto their first attempts at any piece of code)
  • While it’s possibly an even more minor point, you will simply seem more professional to those that you work with. Professionals invest time in skills that matter in their profession. If you’re reading this, chances are that typing is in that category for you.

I’m really surprised that typing isn’t considered a core skill for IT people today. Perhaps there will be a time in the future when it won’t matter but it does matter today. Curiously though, it’s often viewed as a less-important skill to master. But those who have done so will all tell you how important it is.

When I worked at a university, I was pushing for students to learn typing in the first two weeks of their Computer Science degrees. The idea was almost mocked by most lecturers as typing wasn’t something that was taught by universities. Yet the same university struggled to have enough resources for students to use, and the resources they had were tied up by people that took forever to type anything. The real irony is that in most courses, there isn’t much other useful work that the students could do in the first week or two anyway.

So, if you have made it to this point in the blog, and you can’t yet type, I hope that you’ll consider it.

02 Aug 00:57

July, the 31 Days of SQL Server DMO’s – Day 19 (sys.dm_exec_query_stats)

by Tamarick Hill

The sys.dm_exec_query_stats DMV is one of the most useful DMV’s out there when it comes to performance tuning. If you have been keeping up with this blog series this month, you know that I started out on Day 1 reviewing many of the DMV’s within the ‘exec’ namespace. I’m not sure how I missed this one considering how valuable it is, but hey, they say it’s better late than never right?? On Day 7 and Day 8 we reviewed the sys.dm_exec_procedure_stats and sys.dm_exec_trigger_stats respectively. This sys.dm_exec_query_stats DMV is very similar to these two. As a matter of fact, this DMV will return all of the information you saw in the other two DMV’s, but in addition to that, you can see stats for all queries that have cached execution plans on your server. You can even see stats for statements that are ran Ad-Hoc as long as they are still cached in the buffer pool.

To better illustrate this DMV, let have a quick look at it:

SELECT * FROM sys.dm_exec_query_stats

image

image

image

image

As you can see, there is a lot of information returned from this DMV. I wont go into detail about each and every one of these columns, but I will touch on a few of them briefly. The first column is the ‘sql_handle’, which if you remember from Day 4 of our blog series, I explained how you can use this column to extract the actual SQL text that was executed. The next columns statement_start_offset and statement_end_offset provide you a way of extracting the exact SQL statement that was executed as part of a batch. The plan_handle column is used to extract the Execution plan that was used, which we talked about during Day 5 of this blog series. Later in the result set, you have columns to identify how many times a particular statement was executed, how much CPU time it used, how many reads/writes it performed, the duration, how many rows were returned, etc. These columns provide you with a solid avenue to begin your performance optimization. The last column I will touch on is the query_plan_hash column. A lot of times when you have Dynamic SQL running on your server, you have similar statements with different parameter values being passed in. Many times these types of statements will get similar execution plans and then a Binary hash value can be generated based on these similar plans. This query plan hash can be used to find the cost of all queries that have similar execution plans and then you can tune based on that plan to improve the performance of all of the individual queries. This is a very powerful way of identifying and tuning Ad-hoc statements that run on your server.

As I stated earlier, this sys.dm_exec_query_stats DMV is a very powerful and recommended DMV for performance tuning. You are able to quickly identify statements that are running on your server and analyze their impact on system resources. Using this DMV to track down the biggest performance killers on your server will allow you to make the biggest gains once you focus your tuning efforts on those top offenders.

For more information about this DMV, please see the below Books Online link:

http://msdn.microsoft.com/en-us/library/ms189741.aspx

Follow me on Twitter @PrimeTimeDBA

02 Aug 00:55

FAQ on ADFS - Part 1

by Greg Jaworski [MSFT]

Hello everyone, Jasmin here again and this time I am writing about Active Directory Federation Server (ADFS). Lately, I have been getting several questions from most of my customers and some of my peers around ADFS deployment, planning, setup, implementation etc. While addressing these questions, I realized that I was answering similar type of queries especially when it was a first time ADFS deployment effort. I have therefore created a list of common Q/A around ADFS in hopes that it would benefit those looking into federation for the first time.

What is ADFS?

ADFS helps you use single sign-on (SSO) to authenticate users to multiple web applications over the life of a single session. This is accomplished by securely sharing digital identity and rights (Claims) across security and enterprise boundaries. Some of the ADFS uses can be found here

What are the different versions of ADFS? Which one is the latest?

There are three versions of ADFS.

- AD FS 1.0 - released with Windows Server 2003 R2 as part of the operating system and could be installed as a Windows component.

- AD FS 1.1 - released with Windows Server 2008 and was carried into Windows Server 2008 R2. In both editions, AD FS was installed from the Server Manager as a role. There were minimal changes from AD FS 1.0 to AD FS 1.1.

-  AD FS 2.0 was released after Windows Server 2008 R2. It was released to the web and is free to download. It requires at least Windows Server 2008 SP2 to install. Two versions (x86 and x64) are available for Windows Server 2008, while only the x64 version is available for Windows Server 2008 R2.

- ADFS 2.1 was released to Windows Server 2012 as part of the operating system and therefore, can be installed as a Role from Server Manager.

One thing to note is that, AD FS 1.x is limited in its standards support which includes WS-Federation Passive Requestor Profile (browser) and SAML 1.0 TOKENS while AD FS 2.0 extends standards support for WS-Federation. It supports WS-Federation PRP, WS-Federation Active Requestor Profile, SAML 1.1/2.0 TOKENS, SAML 2.0 Operational Modes, IdP Lite/SP Lite/eGov 1.5

What is the benefit of installing ADFS on Windows Server 2012 versus on Windows Server 2008 R2?

- In Windows Server 2012, ADFS 2.1 is released as part of the operating system and is installed from the Server Manager as a role. Server Manager provides configuration wizard pages that perform validation checks and automatically install all the services that AD FS depends on. Whereas, in Windows Server 2008 SP2 or Windows Server 2008 R2, ADFS 2.0 must be installed from the web. You will also need to install the update rollup 3 for Windows Server 2008 and 2008 R2 which is located here. Furthermore, With Windows Server 2012, the AD FS server role now includes new cmdlets that you can use to perform PowerShell-based deployment within your federated identity installations and environments. Detailed cmdlets information can be found here. Lastly, with Windows Server 2012, AD FS can be integrated with Dynamic Access Control scenarios allowing AD FS to consume AD DS claims that are included in Kerberos tickets as a result of domain authentication. More information on claims can be found here

Which AD FS configuration database store should I choose, Windows Internal Database (WID) or SQL?

- The AD FS configuration database stores all the configuration data. It contains information that a Federation Service requires to identify partners, certificates, attribute stores, claims, etc. You can store this configuration data in either a Microsoft SQL Server 2005 or newer database or the Windows Internal Database (WID) feature that is included with Windows Server 2008, Windows Server 2008 R2 and Windows Server 2012. Following is a short description of

WID Advantages

WID Disadvantages

Very easy to setup and implement

Supports five federation servers in a farm

Load balancing and fault tolerance is possible if setup as a farm.

SAML artifact resolution and SAML/WS-Federation token replay detection feature is not available

Supports multiple Federation Servers in a farm (limits to 5 federation server in a farm)

It is not supported if there is more than 100 claim trust providers trust or more than 100 relying party trusts.

More info: In a farm with WID as the database, the first server in the farm act as the primary server and host a read/write copy of the database. Secondary servers then replicate inbound the configuration data into their read-only database. They are fully functional federation members and can service the clients just like the Primary server. They are just unable to write any configuration changes to the WID which does not take place every day.

SQL Advantages

SQL Disadvantages

Supports multiple federation servers (not subject to the limitation of WID)

Additional setup complexities. Require PowerShell to install it

Load balancing and fault tolerance

SQL cluster introduces another potential point of failure

Easily Scalable

SQL server must be performing well to service requests

SAML artifact resolution and SAML/WS-Federation token replay detection supported

 

If the Primary Server in the farm is down, what happens?

Another server in the farm can be configured as the primary server. Below is the PowerShell command to run on the secondary server which you want to make primary:

Add-PsSnapin Microsoft.Adfs.PowerShell

Set-AdfsSyncProperties -Role PrimaryComputer

Once the primary federation server is set run the following PowerShell commands on the other secondary federation servers to sync them with the new Primary ServersCommand to run on the other farm member servers:

Add-PsSnapin Microsoft.Adfs.Powershell

Set-AdfsSyncProperties -Role SecondaryComputer -PrimaryComputerName {FQDN of the Primary Federation Server}

Is it possible to move from WID to SQL at some point in the future?

o Yes it is supported to move from WID to SQL. Detailed steps are documented here

Is SAML artifact resolution and SAML/WS-Federation token replay detection feature required by most Relying Parties?

o From my experience most Relying Parties do not require this feature. However, there are some that do. So it would be wise to check on that before deciding the database configuration store. If that is a requirement, the SQL must be selected.

What is the difference between a single ADFS server versus a farm? Which one is better?

ADFS can be setup as a

o Standalone federation server.

o Farm Federation Server using WID

o Farm Federation Server using SQL

Farm federation server is definitely a better option than a standalone federation server for the obvious reasons – scalability and redundancy. Standalone federation server only support a single server and only store configuration information on a Windows Internal Database (WID). Of course It is easy to setup and its best for lab environment but lacks scalability and redundancy. Moreover, you cannot add more than one server to the Standalone federation server. However, with a farm federation server, you can start a farm with one single ADFS server and add more ADFS servers to the farm at that time or sometime in the future. I often get this question, can a farm federation server using WID function with one server? And the answer is YES! But remember you cannot benefit from load balancing and redundancy since there is only one server in the farm. For more information on Federation Server using WID or SQL please refer to the question of which database to choose.

Which type of certificates does AD FS require?

Basically you need three types of certificate.

o Service communication certificate

§ AD FS uses this certificate to enable HTTPS which is a requirement for traffic to and from the federation server and federation server proxies ( to secure communication) So it is basically a SSL certificate which needs to be installed on the IIS for each federation server and federation server proxy

o Token signing certificate

§ AD FS uses this certificate to digitally sign outgoing AD FS tokens. This is not used to secure data but in fact it is used to ensure the integrity of the security tokens as they pass between the federation servers and application server via the client computer.

o Token decrypting certificate

§ AD FS 2.0 and above has the ability to encrypt the contents of the AD FS tokens. This is in addition to having these tokens signed by the server’s token signing certificate.

Where can I obtain the required certificates from?

There are several options and each have their pros and cons.

o Server communication certificate

§ This certificate must be trusted by the client computers so it is recommended that in a production environment this certificate is obtained from a public CA. Other alternative is to use your enterprise CA (PKI) to issues this cert however, you will need to ensure that this certificate is trusted by all client computers. You may have to use Group Policy to manually push down this certificate. Bear in mind that if the client machines are not joined to the domain, they may not be able to trust your internal certificate which could result in bad user experience such as receiving security alert prompts when they try to access the federated resources. In your test environment, you can easily use a self-signed certificate if you wish as security is usually not of a concern in a lab environment.

o Token Signing Certificate

§ This certificate can be issued via enterprise CA, public CA or by creating a self-signed certificate. The way it is installed depends on how you create the AD FS farm. We recommend that all federation servers in the farm use the same token signing certificate. Hence you can install this certificate from the CA on a federation server and export the cert along with the private key to other federation servers in the farm and save the cost involved in obtaining a certificate from the public CA. However, the option that I personally favor is to allow what AD FS 2.x does by default i.e. it creates a self-signed certificates for signing tokens. I like this option because the maintenance is very low. It has a validity of one year after which it must be renewed however, AD FS provides the capability for automatic renewal (certificate rollover) for self-signed certificates before expiry and if the relying party trust is configured for automatic federation metadata, it will automatically provide the new public key to the relying party.

o Token Decrypting Certificate

§ AD FS 2.x uses the Secure Sockets Layer (SSL) certificate for Internet Information Services (IIS) as the default decryption certificate.

How can I check if my ADFS server is operating successfully?

Check for Event ID 100 under Applications and Service Logs | AD FS | Admin. This event verifies that the federation server was able to successfully communicate with the federation service.

Is there a checklist that I can follow to setup ADFS in my environment?

ADFS 2. 0 http://technet.microsoft.com/en-us/library/dd807086(v=ws.10).aspx

I also found a checklist specifically for Windows Server 2012 which is located at http://technet.microsoft.com/en-us/library/dd807086.aspx

That’s it for now. As I get more questions, I will create part 2 of the ADFS FAQ.

Cheers,

Jasmin Amirali

02 Aug 00:53

SQL Server 2014: In-memory OLTP Engine, code-name: Hekaton

by James Serra

What looks to be the biggest and best new feature in SQL Server 2014 is called the “In-memory OLTP Engine”, code-name: Hekaton.  In short, Hekaton is a SQL Server In-memory OLTP Engine.  Hekaton provides in-memory OLTP capabilities built into core SQL Server database to significantly improve the performance of your database application.  Hekaton is installed with the SQL Server 2014 Engine without requiring any additional actions and allows in-memory performance benefits without rewriting your database application.  You can also increase performance of existing SQL Server applications without having to refresh your hardware.  Hekaton is easy to deploy and allows you to access the other rich features in SQL Server, while taking advantage of in-memory performance.

This complements the existing in-memory data-warehousing and business-intelligence (BI) capabilities already in SQL Server via xVelocity.  Hekaton could benefit DW/BI solutions in two ways that I see: 1) By speeding up the time it takes to pull data from a source system that is in-memory into the data warehouse, and 2) Instead of persisting staging data in tables on disk, you can store them in memory.

Here is the description from the Microsoft site:

Hekaton in-memory OLTP: In-memory database technology that will provide breakthrough performance gains 10 times for existing apps and up to 50 times for new applications optimized for in-memory performance with no additional hardware required.  There will be a diagnostic tool that will suggest which databases and tables are the best candidates to run in-memory.  The “Hekaton” codename will fade and  it will be referred to as the SQL Server In-Memory OLTP Engine.

More info:

In-Memory OLTP White Paper

Edgenet Gain Real-Time Access to Retail Product Data with In-Memory Technology

First steps with Extreme Transaction Processing – Hekaton

SQL Server 2014 In-Memory Technologies: Blog Series Introduction

Getting Started with SQL Server 2014 In-Memory OLTP

SQL Server 2014 CTP1 native compiled Hekaton procedures are faster than regular procedures

Supported and unsupported datatypes with Hekaton tables

Hekaton data and code – where does that stuff actually live?

Geek City: SQL Server 2014 In-Memory OLTP (“Hekaton”) Whitepaper

First baby steps with SQL Server 2014 CTP1

Exploring In-memory OLTP Engine (Hekaton) in SQL Server 2014 CTP1

SQL 2014 In-Memory OLTP ‘Hekaton’: training videos and white papers

Hekaton: Transforming Query Plans into C Code

In-Memory (Memory Optimized) Tables in SQL Server 2014

In-Memory OLTP Q & A: Myths and Realities

Architectural Overview of SQL Server 2014’s In-Memory OLTP Technology

SQL Server 2014 hastens transaction processing

A Tour of the Hekaton AMR Tool

Hardware Considerations for In-Memory OLTP in SQL Server 2014

SQL Server 2014 In-Memory OLTP: App Migration Scenario Leveraging the Integrated Approach

Extreme Transaction Processing (XTP, Hekaton) – the solution to everything?

How Memory-Optimized Database Technology is Integrated into SQL Server 2014

02 Aug 00:51

SQL Community Speakers – Why We Do It

by Tim Radney

Image

I started this post over a year ago and have just now come back around to finish it.  Over the years I have been asked about a few misconceptions about speakers for SQL Saturday’s.

Over the past several years as I have been involved with speaking and organizing SQL Saturday’s. Like many I love hanging out with the other speakers, catching up and meeting new people attending the event. There are few greater rewards than being able to share knowledge and empower others with what you know.

Over and over again I have either been asked or over heard individuals making comments about how much it must cost to cover the expenses of the speakers.  When I am asked or over hear these comments I quickly explain that most speakers are there as volunteers.

I explain that the majority of the speakers at SQL Saturday’s are there on their own dime. Regardless if they are local speakers, regional, or flew in from across the country most have paid their own way. There are the few exceptions with speakers that started working for the software companies but many of them are still volunteering their personal time because they love to pay it forward as well.

When you ask any speaker why they give up their weekend and dip into their families budget to come to these events to speak, you will get the same answer, “to pay it forward”. Every speaker I have spoken to comments on how much the SQL Community has helped their career. Getting involved to share their knowledge is their way of paying it forward and give back to the community.

That is how I got involved too.  I had attended two SQL Saturdays before attending Summit in 2010. (I actually learned about SQL Saturdays at Summit 2009) While attending Summit 2010 Jose Chinchilla (blog | twitter) convinced me to submit an abstract to SQL Saturday 62 in Tampa FL.  That is all it took for me.  I spoke at my first event, got positive feed back and helped several people in my session with real world issues.  It was such a rush.  I haven’t let up since then.  Check out my Community Page for details.

A few folks have mentioned that it must be a great way for consultants to pick up new clients.  Of course getting in front of people in your field demonstrating your subject matter expertise might be able to land you some new work, but ask any independent consultant who gets up their to speak why they do it.  They will tell you the same thing, to share knowledge and try to help as many people as they can.  I hear many of them all the time offer their email address, twitter handle, etc and say they are willing to help anyone.  The common saying is “I will answer any question and give guidance, if that question turns into several hours of help, then we are getting into billable time”.

Want proof of that, monitor the hashtag #sqlhelp on twitter and see the level of professionals help people out from all over the world.

What really impresses me about most of these consultants is that during there sessions, they never talk about their company other than their BIO slide.  What they do share are all their real world experiences.  Now that is valuable information.  If you are in need of some professional contracted help at work, who better than those who help make this community great.

When I saw the image above I couldn’t help but think about our SQL Community. We are in it for the outcome of our community.  A bigger and better community.

Share

02 Aug 00:51

How many SQL Server DBAs does an organization need?

by RickHeiges
"How many SQL Server DBAs do we need?" - This is a question that often comes up in conversations with customers. Essentially, customers want to know if they have enough DBAs or too many. This is not a trivial question. If you do some research online via your favorite search engine, you will most likely come across numbers such as 40-65 DBs per DBA (for SQL Server). I remember finding another number relating to storage space instead of the number of DBs; the range was 3-5TB per DBA. I have had custoemrs...(read more)
02 Aug 00:51

What’s New in 2012 R2: Enabling Open Source Software

by BradAnderson

Blog-Header_Graphic_DataCenter 
Part 4 of a 9-part series.

The Common Engineering Criteria (CEC) that I’ve talked about in previous posts have been instrumental in ensuring that the teams contributing to 2012 R2 met rigorous and specific criteria in areas like Manageability, Virtualization Readiness, Data Center & Enterprise Readiness, Reliability, Hardware Support, and Interoperability. The original idea behind the creation of the CEC back in the early 2000’s was to drive consistency across all of the Microsoft Server workloads and applications – simplifying the experience of using multiple Microsoft solutions together, as well as driving down the total cost of owning and operating our solutions.

The benefit of the CEC is that it ensures all our enterprise solutions come ready to be deployed, operated, and managed “out-of-the-box” in the Microsoft Clouds. To get really specific, each and every workload team across Microsoft (e.g., all the roles in Windows Server, the Office Workloads, etc.) deliver the knowledge that instructs System Center how to operate that workload in the Microsoft Clouds. If you ever ask the question, “Which Cloud is best suited for running Microsoft workloads?” the answer should be clear – the Microsoft Clouds!

One of the best things about this 2012 R2 wave of products was how its unified planning and engineering milestones allowed teams from across Microsoft to work on different aspects of the release in parallel. This process brought together a genuinely amazing collection of solutions and value while leveraging the expertise of each team and each individual. This is a really important point to think about. If I were to ask who has the most knowledge and expertise in how Windows Server should be deployed and operated, the answer is simple: The Windows Server team at Microsoft. Therefore, whenever you deploy Windows Server in the Microsoft Clouds you have an enormous built in advantage: The expertise of the entire Windows Server team. This far-reaching expertise is expressed through the Common Engineering Criteria in the form of any future updates that will become available – every one of those updates will be built, tested, and approved by the Windows Server team, aka the world’s foremost Windows Server deployment experts.

There are a lot of great surprises in these new R2 releases – things that are going to make a big impact in a majority of IT departments around the world. Over the next four weeks, the 2012 R2 series will cover the 2nd pillar of this release: Transform the Datacenter. In these four posts (starting today) we’ll cover many of the investments we have made that better enable IT pros to transform their datacenter via a move to a cloud-computing model.

This discussion will outline the ambitious scale of the functionality and capability within the 2012 R2 products. As with any discussion about the cloud, however, there are key elements to consider as you read. Particularly, I believe it’s important for all of our discussions – whether online or in person – to remember that cloud computing is a computing model, not a location. All too often when someone hears the term “cloud computing” they automatically think of a public cloud environment. Another important point to consider is that cloud computing is much more than just virtualization – it is something that involves change: Change in the tools you use (automation and management), change in processes, and a change in how your entire organization uses and consumes its IT infrastructure.

Microsoft is extremely unique in this perspective, and it is leading the industry with its investments to deliver consistency across private, hosted and public clouds. Over the course of these next four posts, we will cover our innovations in the infrastructure (storage, network, compute), in both on-premise and hybrid scenarios, support for open source, cloud service provider & tenant experience, and much, much more.

As I noted above, it simply makes logical sense that running the Microsoft workloads in the Microsoft Clouds will deliver the best overall solution. But what about Linux? And how well does Microsoft virtualize and manage non-Windows platforms, in particular Linux?  Today we’ll address these exact questions.

Our vision regarding other operating platforms is simple: Microsoft is committed to being your cloud partner. This means end-to-end support that is versatile, flexible, and interoperable for any industry, in any environment, with any guest OS. This vision ensures we remain realistic – we know that users are going to build applications on open source operating systems, so we have built a powerful set of tools for hosting and managing them.

A great deal of the responsibility to deliver the capabilities that enable the Microsoft Clouds (private, hosted, Azure) to effectively host Linux and the associated open source applications falls heavily on the shoulders of the Windows Server and System Center team. In today’s post Erin Chapple, a Partner Group Program Manager in the Windows Server & System Center team, will detail how building the R2 wave with an open source environment in mind has led to a suite of products that are more adaptable and more powerful than ever.

As always in this series, check out the “Next Steps” at the bottom of this post for links to a variety of engineering content with hyper-technical overviews of the concepts examined in this post.

* * *

 

During the planning process for a release we look at the assumptions we’re making, challenge them, and then look ahead to demarcate how our industry will be shaped by changing market conditions. While open source software has been present in the datacenter for several years, as we look at what a modern datacenter entails it is increasingly clear that enabling open source software is a key tenet in our cloud offerings.

Not only do enterprises run key workloads based on Linux and UNIX, but, in this cloud-first world, many applications leverage open source components. To provide our customers with one cloud infrastructure, one set of system management tools, and one set of paradigms to transform their datacenter with the cloud, we knew that we needed to ensure that Windows is the best platform to run Linux workloads as well as open source components. With Windows Server 2012 R2, System Center 2012 R2, and in the public-cloud with Windows Azure, IT pros now have this assurance.

Whether it’s on-premise management of your datacenter, running in the Microsoft public cloud, or a hybrid of the two, you can now run and manage Windows and Microsoft applications, as well as run and manage Linux, UNIX, and open source applications – all with a consistent experience.

Windows Server: The Best Infrastructure to Run Linux Workloads

Consider this scenario: You are an infrastructure administrator for a large hosting company, or for an IT department within an enterprise organization. Your customers very likely want to host and manage complex applications that require services running on multiple Windows and Linux guest virtual machines. Today, you may be using separate hosting and management tools for the Windows and Linux environments. This means separate hypervisors, separate management tools, and separate user interfaces for your customers. You may even have separate technical staff for the two environments! This bifurcation dramatically increases complexity and costs – both for you and for your customers.

Windows Server 2012 R2 and System Center 2012 R2 offer consolidation on a single infrastructure to run and manage Windows and Linux guest virtual machines. With a single infrastructure, operations and processes are simplified considerably. For example, you no longer have to deal with the complexity of handling Windows one way and Linux another, and complex applications that have Windows and Linux components are no longer a special case that must span two infrastructures. Now you can spend more time providing great service to your customers and less time dealing with operating system differences. Your customers also get the advantage of a single, unified, view of their applications and workloads, with consistent and unified reporting, resource usage, and billing.

At the core of enabling this single infrastructure is the ability to run Linux on Hyper-V. With the release of Windows Server 2012 Hyper-V, and enhanced by the updates in the 2012 R2 version, Hyper-V is at the top of its game in running Windows guests. We’re delivering this with engineering investments in Hyper-V, of course, but also in the Linux operating system.

You read that correctly – some of the work we are doing at Microsoft involves working directly with the Linux community and contributing the technology that really enables Hyper-V and Windows to be the best cloud for Linux.

Here’s who we’ve done it: Microsoft developers have built the drivers for Linux that we call the Linux Integration Services, or “LIS.” Synthetic drivers for network and disk provide performance that nearly equals the performance of bare hardware. Other drivers provide housekeeping for time synchronization, shutdown, and heartbeat. Directly in Hyper-V, we have built features to enable live backups for Linux guests, and we have exhaustively tested to ensure that Hyper-V features, like live migration (including the super performance improvements in 2012 R2), work for Linux guests just like they do for Windows guests. In total, we worked across the board to ensure Linux is at its best on Hyper-V.

To ensure compliance, Microsoft had done this LIS development as a member of the Linux community. The drivers are reviewed by the community and checked into the main Linux kernel source code base. Linux distribution vendors then pull the drivers from the main Linux kernel and incorporate them into specific distributions. LIS is currently a built-in part of these distributions:

  • Red Hat Enterprise Linux 5.9 and 6.4
  • SUSE Linux Enterprise Server 11 SP2 and SP3
  • Ubuntu Server 12.04, 12.10, and 13.04
  • CentOS 5.9 and 6.4
  • Oracle Linux 6.4 (Red Hat Compatible Kernel)
  • Debian GPU/Linux 7.0

Updates to LIS for the 2012 R2 release tackle several key issues needed to bring Linux to the same baseline as Windows when running on Hyper-V:

  • Dynamic memory: Increase Linux VM density on Hyper-V by having Hyper-V automatically add and remove physical memory for Linux guests based on the guest needs, just like for Windows.
  • 2D synthetic video driver: Gives great 2D video performance for Linux guests, and solves earlier problems with duplicate mouse pointers.
  • VMbus protocol updates: Linux guests have the ability to spread interrupts across multiple virtual CPUs for better performance, just like for Windows.
  • Kexec: Linux guests running in Hyper-V can get crash dumps, just like on physical hardware.

These enhancements and others are described in more technical detail on the Hyper-V Virtualization blog.

Going forward, Microsoft will continue the cycle of enhancing the Linux Integration Services to match new Hyper-V capabilities, contributing the enhancements to the Linux kernel through the community process, and then working with distribution vendors to incorporate the latest LIS into new Linux distribution versions. As a result, IT pros can be confident in Microsoft’s commitment to offering a unified infrastructure and to helping you and your customers to reduce cost and complexity. Also keep in mind that all the work we do in Windows and Hyper-V is applicable and consistent across the Microsoft clouds: Private, hosted and Windows Azure. Because Windows Server and Hyper-V are the foundation of Windows Azure, all our investments directly apply.

Manage Heterogeneous Environments Using Standards and System Center

Consider another scenario: As an infrastructure administrator for a large hosting company, or for an IT department within an enterprise organization, you have to manage those Windows and Linux guest VMs and associated applications that are running on Hyper-V. Most likely, you also have physical computers running Windows, Linux, or UNIX that haven’t been virtualized. As with the core execution layer provided by Hyper-V, you’d like to have consistent management of these different operating systems, and consistent management of the different “hardware” – whether it be virtual or physical. You don’t want different consoles, different tools, and different processes/procedures for the different operating systems and hardware. Most importantly, you don’t want your customers to see these differences. Management represents the second major investment area of OSS enablement.

To support a single infrastructure that runs and manages Windows and Linux, we bet on standard-based management using CIM (Common Information Model) and WS-Man (Web Services for Management). At the heart of this bet is the work we are driving with the industry on the Data Center Abstraction Layer (DAL) to provide a common management abstraction for all the resources of a data center to make it simple and easy to adopt and deploy cloud computing. The DAL is not specific to one operating system; it benefits Linux cloud computing efforts every bit as much as Windows. The DAL uses the existing DMTF standards-based management stack to manage all the resources of a data center. To support the DAL, Microsoft has contributed Open Management Infrastructure (OMI) as an open source implementation of these standards along with a set of providers for managing Linux. We built OMI from the ground up to support Linux natively and provide the rich functionality, performance and scale traits needed in a Linux CIMOM.

For our customers, we wanted to make managing Linux and any CIM-based system simple to automate via PowerShell. We introduced the PowerShell CIM cmdlets in Windows Server 2012 which enable IT pros to manage CIM based systems natively from Windows.

To learn more about these cmdlets in PowerShell you can type:

Get-Command –Module CimCmlets

System Center builds upon and enhances the core management investments in the platform to deliver consistent management across Windows, Linux, and UNIX. We started on this journey several years ago, and System Center Operations Manager was the first major area of investment, offering Linux/UNIX monitoring more than 4 years ago. Since then, Microsoft has broadened our Linux/UNIX coverage to include Configuration Manager, Virtual Machine Manager, and now, in the System Center 2012 R2 release, Data Protection Manager.

In Operations Manager, nearly all of the functionality available for Windows servers is also available for Linux and UNIX servers: Monitor OS health and performance, monitor log files, monitor line-of-business application, monitor databases and web servers, and audit security relevant events. Going up the software stack, Microsoft supplies management packs for Java application servers, both open source (Tomcat, JBoss) and proprietary (IBM WebSphere and Oracle WebLogic). Partners also supply management packs for other open source software such as MySQL and the Apache HTTP Server. This functionality appears in a single console, with Windows, Linux, and UNIX computers side-by-side so that you get one view of your workloads and applications, as seen here:

1

Similarly, core Configuration Manager functionality is available for Linux and UNIX, including hardware inventory, inventory of installed applications, the ability to distribute and install software packages, and reporting on all of these areas. ConfigMgr can install open source and proprietary software packages to Linux and UNIX in almost any format. ConfigMgr also includes anti-virus agents for all of the Linux distributions managed by Microsoft. Again, Windows, Linux, and UNIX computers appear side-by-side, with one set of concepts and paradigms as highlighted below. This means you spend less time flipping between environments and more time solving real problems.

2

Virtual Machine Manager is the fabric controller, and it is at the heart of a private cloud. It manages Windows guests and Linux guests running on Hyper-V, and it can personalize Linux OS instances during deployment, so that multiple Linux guests can be deployed from a single template (with each guest automatically getting a unique identity, IP address, etc. – just like for sysprep’ed Windows images). For those complex applications with Windows and Linux components, Linux can participate in VMM service templates to deploy a multi-tier service. The service template can be all Linux, or it can be mixture of Linux and Windows tiers. Almost of all the rest of the great capabilities of VMM are agnostic to the guest OS, so live migration and placement, IP address management, network virtualization, and storage management easily work for Linux. With this level of consistency, you will rarely need to worry about whether a virtual machine is running Windows or Linux.

In System Center 2012 R2, Data Protection Manager adds the ability to backup Linux guest VMs running on Hyper-V, again giving you consistency across Windows and Linux. The Linux guest VMs can continue running live – there is no need to pause or suspend them – and DPM will get a file system consistent snapshot of the VM to backup. “File system consistent” means that the Linux file system buffers are automatically flushed via integration with the Linux Integration Services for Hyper-V. This kind of consistency is analogous to application consistency via VSS writers that are available for Windows VMs.

System Center 2012 R2 gives you a single, consistent, systems management infrastructure for private clouds with Windows and Linux, or in your datacenters physical or virtualized infrastructure running Windows, Linux, and UNIX. Applications with Windows and Linux components can be deployed and managed from a single interface, giving you reduced complexity and reduced costs.

Open Source on Windows

In any IT environment, open source is more than just the operating system. You may be using open source components in your applications, whether you are a vendor offering Software-as-a-Service (SaaS) from the cloud, or an enterprise running open source components in your datacenter.

To provide customers with increased flexibility for running open source-based applications on Windows, Microsoft simplified the process for building, deploying and updating services that are built on Windows. This was achieved through the development of a set of tools called “CoApp” (Common Open source Application Publishing Platform), which is a package management system for Windows that is akin to the Advanced Packaging Tool (APT) on Linux.

Using CoApp, developers on Windows can easily manage the dependencies between components that make up an open source application. Developers will notice that many of the core dependencies, such as zlib and OpenSSL, are already built to run on Windows and are available immediately in the NuGet repository. Through NuGet, CoApp-built native packages can be included in Visual Studio projects in exactly the same manner as managed-code packages, making it very easy for a developer to download core libraries and create open source applications on Windows. Those of you with a developer orientation can get more details on CoApp in these videos: GoingNative - Inside NuGet for C++ and Building Native Libraries for NuGet with CoApp’s PowerShell Tools.

We’ve also done great work collaborating with the open source community to ensure specific OSS apps run on, and are optimized for, Windows. For example, consider PHP, which is a foundational component of many content management and publishing applications. Microsoft works in the PHP community to ensure that versions are available which run natively on Windows, right alongside the versions that run on Linux or UNIX. The newest version, PHP 5.5.0, has just been released for Windows on the same day that it was released for other operating systems. The Windows version includes significant performance improvements that deliver functionality that will surprise many.

In addition to all of these improvements, the Azure gallery now includes a broad range of Open Source applications thereby providing customers with ready access to install and run commonly used Open Source software on Azure.

3

Microsoft’s ongoing commitment to supporting Open Source Software has been highlighted recently in two important partnerships:  First, our customers can now run Oracle software on Windows Server Hyper-V and in Windows Azure encompassing Java, Oracle Database, and Oracle WebLogic Server.  This can all happen on Windows Server Hyper-V or Windows Azure in a fully supported mode.  Second, a new Java development kit (JDK) will be available through a partnership with Azul Systems.  This will enable customers to deploy Java applications on Windows Azure using open source Java – on Windows and Linux.

Summary

Enabling open source software is a key part of our promise to support the efforts of our customers as they continue to transform their datacenters with the cloud. This enablement is a key tenet of the scenarios we design and build our products to handle. The features and functions that enable open source software are an integral part of our products, and each element of these products are built and tested by our core engineering teams. These efforts are fully supported by Microsoft.

As you might expect for the “Enable OSS” tenet of this 2012 R2 release, key parts of our open source enablement are themselves open source. For example, the Linux Integration Services are open source in the Linux kernel, and Microsoft releases the source code for most of the agents that System Center uses on Linux and UNIX to provide management capabilities. OMI and CoApp are also an open source projects, and, of course, PHP on Windows is part of the PHP open source project.

With this release Microsoft is clearly the choice for datacenter infrastructure if you require the ability to run and manage open source software alongside Windows.

* * *

This post has covered one key trend in the datacenter (the need for one infrastructure and management solution for Windows and Linux), and next week we’ll examine another critical element: The need for enterprises to act more like service providers.

A core requirement for this trend is for enterprises to deploy and operate an Infrastructure-as-a-Service. In next week’s post, we’ll examine the infrastructure and experience improvements we’ve made across Windows and System Center to enable this scenario for customers.

- Brad

 

NEXT STEPS:

To learn more about the topics covered in this post, check out the following articles:

 

    28 Jul 19:15

    Interval Queries in SQL Server Part 5

    by Dejan Sarka

    This is the fifth part of the solutions for interval queries in SQL Server. However, it is empty for nowL For an introduction, please refer to the blog post Interval Queries in SQL Server Part 1 (http://blogs.solidq.com/dsarka/Post.aspx?ID=149&title=Interval+Queries+in+SQL+Server+Part+1). You can find the second part of the solutions in the blog post Interval Queries in SQL Server Part 2 (http://blogs.solidq.com/dsarka/Post.aspx?ID=150&title=Interval+Queries+in+SQL+Server+Part+2), the third part in the blog post Interval Queries in SQL Server Part 3 (http://blogs.solidq.com/dsarka/Post.aspx?ID=151&title=Interval+Queries+in+SQL+Server+Part+3), and the fourth part in the blog post Interval Queries in SQL Server Part 4 (http://blogs.solidq.com/dsarka/Post.aspx?ID=152&title=Interval+Queries+in+SQL+Server+Part+4). Note that you also need to read an excellent article by Itzik Ben-Gan wrote on interval queries in SQL Server (http://sqlmag.com/t-sql/sql-server-interval-queries) by using the Relational Interval Tree model. I am using the tables and data Itzik has prepared. In order to test the solutions, you can download the code from Itzik's article by using the link in this paragraph.

    My friend Itzik found very quickly an error in this code. However, this is not my fault. I blame the beer. Yesterday I was sitting for a long time with Matija Lah. All of the time I was complaining that I have a bad feeling about this solution - it seemed too simple. And instead of going home and trying to check and fix it, I just ordered another beer and kept complainingJ Anyway, with Itzik, now we are taking vacations and going to Ireland for a hike, and we are going to try to make this solution working. I will leave this post for now, as a warning to those who read the erroneous one. In the future, there are two possible outcomes:

    1. If this post gets some meat, it means that we found a solution.
    2. If this post disappears, it means that we did not find a solution, and that I am trying to pretend it was never publishedJ
    28 Jul 19:15

    Crash when inserting to a FileTable with a parallel plan

    Whilst doing some ad-hoc work on some data on a 2012 SP1 instance I came across an issue with the SQL server raising a sev 20 error with a script that I was attempting to run. The work that I was doing was that I was exporting some binary data (basic text files) that is stored in one table to the file system via a filetable within the same database. It came as bit of a surprise when I saw this error message:

    Location:     "xact.cpp":4175
    Expression:     !m_updNestedXactCnt
    SPID:         88
    Process ID:     15004

    Description:     Trying to use the transaction while there are 1 parallel nested xacts outstanding
    Msg 3624, Level 20, State 1, Line 1
    A system assertion check has failed. Check the SQL Server error log for details. Typically, an assertion failure is caused by a software bug or data corruption. To check for database corruption, consider running DBCC CHECKDB. If you agreed to send dumps to Microsoft during setup, a mini dump will be sent to Microsoft. An update might be available from Microsoft in the latest Service Pack or in a QFE from Technical Support.
    Msg 0, Level 20, State 0, Line 0
    A severe error occurred on the current command.  The results, if any, should be discarded.

    The query that I had developed was nothing special and only consisted of an INSERT..SELECT statement that had a few joins. Naturally at this point I’m reaching for google to see what else I could find out about this error which wasn’t an awful lot. So I opened up my very first connect ticket! I had to complete the task at hand fairly quickly so my first attempt at circumventing this bug was to create a temp table and insert the data into there first before inserting the contents of the temp table directly into the filetable. This actually worked fine and the error wasn’t raised by using this method. Furthermore, further testing showed that using the MAXDOP 1 hint on the original query I developed also eliminated the error.

    Afterwards, I thought I would try to develop a repro script but at first I had trouble generating the error again. It was only after plenty of trial and error that I realised that I could only get the error to occur when 1) SQL choose a parallel plan 2) the filetable that I was inserting into was actually referenced as part of the SELECT query and 3) with a certain amount of data in the base tables. I was joining in the filetable to the query as I wanted to export the data into certain folders and I need the path_locator value as explained in this blog by Bob Beauchemin

    This is the query that I'm using in the repro. Table4 is the filetable:

    INSERT INTO [dbo].[Table4] (name, file_stream, path_locator) 
    SELECT t3.xName, CAST(t3.xData AS VARBINARY(MAX))
    , path_locator.ToString() +  convert(varchar(20), convert(bigint, substring(convert(binary(16), newid()), 1, 6))) + '.'
                                 +  convert(varchar(20), convert(bigint, substring(convert(binary(16), newid()), 7, 6))) + '.'
                                 +  convert(varchar(20), convert(bigint, substring(convert(binary(16), newid()), 13, 4))) + '/'
    FROM [dbo].[Table1] t1
    INNER JOIN [dbo].[Table2] t2 ON t1.Id = t2.xId  
    INNER JOIN [dbo].[Table3] t3 ON t2.Id = t3.xId
    INNER JOIN [dbo].[Table4] t4 ON t4.name = t2.Name
    WHERE t2.zId = 35
    

    For any of those wanting to try this out yourselves, here is a full repro script but I can't guarantee that this will always crash. It does for me on two 2012 instances. I'm not going to show it in its entirety here as there is quite a bit of code/data to actually get the error to surface. Additionally as part of the repro, to get the optimiser to choose a parallel plan I’ve had to use the undocumented DBCC SETCPUWEIGHT as mentioned in this blog by Paul White. One of the tables referenced in the original query is about 30gb in production which may have been a factor why SQL choose a parallel plan as that was the main difference between the production and test boxes. I just couldn’t get SQL to produce a parallel plan during my testing otherwise even with identical schemas and identical data in all other tables bar this larger table.

    Needless to say, don't run the repro in your production!

    Enjoy!

    28 Jul 19:15

    July, the 31 Days of SQL Server DMO’s – Day 25 (sys.dm_db_missing_index_details)

    by Tamarick Hill

    The sys.dm_db_missing_index_details Dynamic Management View is used to return information about missing indexes on your SQL Server instances. These indexes are ones that the optimizer has identified as indexes it would like to use but did not have. You may also see these same indexes indicated in other tools such as query execution plans or the Database tuning advisor. Let’s execute this DMV so we can review the information it provides us. I do not have any missing index information for my AdventureWorks2012 database, but for the purposes of illustrating the result set of this DMV, I will present the results from my msdb database.

    SELECT * FROM sys.dm_db_missing_index_details

    image

    The first column presented is the index_handle which uniquely identifies a particular missing index. The next two columns represent the database_id and the object_id for the particular table in question. Next is the ‘equality_columns’ column which gives you a list of columns (comma separated) that would be beneficial to the optimizer for equality operations. By equality operation I mean for any queries that would use a filter or join condition such as WHERE A = B. The next column, ‘inequality_columns’, gives you a comma separated list of columns that would be beneficial to the optimizer for inequality operations. An inequality operation is anything other than A = B. For example, “WHERE A != B”, “WHERE A > B”, “WHERE A < B”, and “WHERE A <> B” would all qualify as inequality. Next is the ‘included_columns’ column which list all columns that would be beneficial to the optimizer for purposes of providing a covering index and preventing key/bookmark lookups. Lastly is the ‘statement’ column which lists the name of the table where the index is missing.

    This DMV can help you identify potential indexes that could be added to improve the performance of your system. However, I will advise you not to just take the output of this DMV and create an index for everything you see. Everything listed here should be analyzed and then tested on a Development or Test system before implementing into a Production environment.

    For more information on this DMV, please see the below Books Online link:

    http://msdn.microsoft.com/en-us/library/ms345434.aspx

    Follow me on Twitter @PrimeTimeDBA

    28 Jul 19:15

    TechEd Europe 2013 videos online

    by James Serra

    Microsoft has posted ALL the session videos for TechEd Europe 2013 and they are available for free!  Some great videos on the latest products and versions.  Check them out here.  There are four videos on SQL Server 2014:

    Microsoft SQL Server 2014 In-Memory OLTP: Overview

    Microsoft SQL Server 2014: Future and Features

    Microsoft SQL Server 2014 In-Memory OLTP: DB Developer Deep Dive

    Microsoft SQL Server 2014 In-Memory OLTP: Management Deep Dive

    28 Jul 19:06

    Retail Stores Plan Elaborate Ways To Track You

    by timothy
    Velcroman1 writes "Retailers are experimenting with a variety of new ways to track you, so that when you pick up a shirt, you might get a message about the matching shorts. Or pick up golf shoes at a sports store and you see a discount for a new set of clubs. New technologies like magnetic field detection, Bluetooth Low Energy, sonic pulses, and even transmissions from the in-store lights can tell when you enter a store, where you go, and how you shop. Just last year, tracking was only accurate within 100 feet. Starting this year, they can track within a few feet. ByteLight makes the lighting tech, which transmits a unique signal that the camera in your phone can read. The store can then track your location within about 3 feet — and it's already in use at the Museum of Science in Boston."

    Share on Google+

    Read more of this story at Slashdot.



    27 Jul 04:34

    Buffer Pool Extension to SSDs in SQL Server 2014

    by SQL Server Team

    How many of you have been looking into ways of cost effectively improving the performance of your OLTP workloads, without having to buy new more expensive hardware? Most of you know already that memory is one of the key assets available for the SQL Server and especially buffer pool memory, which plays key role in storing data pages and in turn reducing the I/O demand. Have many of you seen your servers struggling from the workloads intensity increase that generated memory pressure and in turn exceeded planned capacity?

    Did you see the machine level limitations that makes it hard to upgrade memory in those machines and you have been thinking that it would be great if you only could upgrade your memory in the same easy way you can upgrade your storage options?

    You are not alone. We have heard similar feedback from many customers, so we tried to address it in the way that would allow you to avoid significant changes in your hardware or create cost-effective mid-size hardware configurations for OLTP workloads from the scratch.

    The answer we come up with is Buffer Pool Extension feature (BPE) that targets nonvolatile storage devices, in particular SSD drives, as an extension for SQL Server buffer pool. Those devices have some advantages over direct memory increase: the first one, already mentioned above, is the higher flexibility of storage options over memory options; the second, might be even more important, is the greater price efficiency of available storage vs. memory.

    What does Buffer Pool Extension offer?

    A few things:

    • Performance gains on OLTP workloads, mostly on the read-heavy OLTP.
    • No risk of data loss. BPE only deals with clean pages.
    • No application modification required. Just enable the feature and you’re ready to go.
    • Simple feature syntax

    Let’s start from the last bullet and take a look on the syntax. Assuming you already have SSD drives ready and configured in Windows the only syntax you need to be aware of is

    ALTER SERVER CONFIGURATION 
    SET BUFFER POOL EXTENSION
    { ON ( FILENAME = 'os_file_path_and_name' , SIZE = <size> [ KB | MB | GB ] )
    | OFF }

    Note: ALTER SERVER CONFIGURATION command requires ALTER SERVER STATE privilege from the user executing the command.

    Yes, this is it. This is how it translates into an actual example:

    ALTER SERVER CONFIGURATION
    SET BUFFER POOL EXTENSION ON
    SIZE = 50 GB
    FILENAME = 'F:\SSDCACHE\EXAMPLE.BPE'

    Looking at the example and going up by the list you can see that there is no additional database or application level syntax available for the feature. Just keep running your workloads as they are currently.

    What does it mean that there is no risk of data loss? BPE only processes clean pages, which ensures that all data in the extension is already committed. Machine crash or power failure? No additional risk compared to the data in memory. What about SSD storage failure? The feature disables itself automatically and can be either re-enabled manually in the same session or attempts to re-enable automatically on instance restart.

    Performance gain is always the trickier subject, mostly because in the real world no two workloads are alike and no two configurations are the same. However, there are some basic sweet spot recommendations you can start with:

    • Up to 32GB RAM available for SQL Server
    • Extension file sized 4x-10x of the memory size available to SQL Server
    • High throughput SSD storage on the machine. We also suggest using high endurance storage, despite the endurance being rarely directly related to performance

    However, we all know that a silver bullet doesn’t always work in the real world, so there are some situations when your expectations should be limited:

    • Data warehouse workloads.
    • Write-heavy OLTP workloads.
    • Machines with more than 64GB of memory available to SQL Server

    Now let’s take a look on the little more complicated scenario that would also allow us to discuss one more interesting point. Let’s assume you have BPE configured, but you don’t like its configuration. The only option available to you is to turn it OFF and then back ON.

    ALTER SERVER CONFIGURATION
    SET BUFFER POOL EXTENSION OFF
    GO
    EXEC sp_configure 'max server memory (MB)', 12000
    GO
    RECONFIGURE
    GO
    ALTER SERVER CONFIGURATION
    SET BUFFER POOL EXTENSION ON
    SIZE = 40 GB
    FILENAME = 'F:\SSDCACHE\EXAMPLE.BPE'

    This is the thing to keep in mind – when you turn OFF the BPE feature it automatically and most likely significantly reduces the amount of addressable memory in your instance’s buffer pool. This will immediately cause increased memory pressure, then increased I/O pressure, and then, in turn, performance drop. Make sure you keep this in mind when updating BPE configuration.

    Now you are ready to start using the feature. All that you need is the SQL Server Enterprise on x64 architecture.

    For those who are interested in the technical details regarding this feature, the buffer pool extension is a "write-through" cache which is an extension to the Dual Write (DW) mechanism that is described in the paper by Do et al [1].

    I hope you will like the feature and let us know what you think. You can download SQL Server 2014 CTP1 here.

    Evgeny Krivosheev,
    SQL Server Program Manager

    Further Reading:

    [1]  "Turbocharging DBMS Buffer Pool Using SSDs",   Do, J., Zhang, D., Patel, J.,  DeWitt, D., Naughton, J., and A. Halverson,  Proceedings of the 2011 ACM SIGMOD Conference,  Athens, Greece,   June 2011.

    27 Jul 04:33

    Interval Queries in SQL Server – Wrap Up

    by Dejan Sarka

    Because SQL Server does not have any support for temporal data out of the box, there are always problems with this kind of data. The most important problem is the performance. Traditional solutions have to deal with slow queries, which also make slow constraints implemented through triggers.

    Itzik Ben-Gan wrote an excellent article on interval queries in SQL Server by using the Relational Interval Tree model. Based on the model developed by Kriegel, Pötke, and Seidl, and enhanced by Martin, Itzik fully developed a T-SQL solution. Davide Mauri proposed usage of spatial data types. I added three additional solutions for fast querying by using just an enhanced WHERE clause in the traditional T-SQL solution, by using the unpacked form of the intervals in an indexed view or an additional table, and by using the IntervalCID CLR data type. In this post, I am simply collecting all of the links you might need to check all of the text and the code and select the best solution for you. Interesting links include:

    As you can see, there is no excuse for slow queries over intervals anymore. In addition, I really hope that Microsoft is considering seriously the connect item and plans to add the temporal data support in SQL Server as soon as possible.

    27 Jul 04:33

    The SQLAndy Rules for Storing Credit Card Numbers

    by Andy Warren

    I’ll try to write more about these over the next couple months, but I wanted to write down a quick reference for those that have to deal with storing credit card numbers. These rules cover the basics – the full topic of protecting card data is easily a book or two. These are my rules, when in doubt consult with your compliance team for final guidance.

    1. Don’t Store Them! PCI compliance is complex and expensive. Getting hacked and dealing with the consequences is more complex and more expensive. Offload the problem to a vendor.
    2. Encrypt The Card Number. If you insist on storing credit card numbers they must be encrypted at rest and you have to document a key management plan that will make your head hurt. Remember this applies to more than just databases – it’s batch processing and voice recordings too. No home grown XOR stuff either, you need big dog encryption.
    3. Do NOT Store Track Data, CVV, CVV2, or PIN. Store any of these and you are automatically “non-compliant” as far as PCI goes. Beyond that, you’re taking a risk that cannot be justified when a breach occurs.
    4. Tokenize Card Numbers. Tokens “don’t count” as long as the tokenization method used is sound. Tokens let you do things like process cancellations without needing the ‘real’ card number.  Tokens do not need to be encrypted.
    5. You Can Store Some Things. You can store first six/last four characters of the card number. You can store the expiration date and the card holder name. None of those require encryption. Don’t store it if you don’t need it.
    6. Log All Access. If someone stores OR retrieves a card number you need to know who, why, when, where, what, and how. If you use tokens, do the same for the tokenize/de-tokenize calls. Logs are incredibly important. They allow you to monitor and potentially detect hackers, and they are your best chance of determining the scope of a breach when it happens. Don’t skimp on this. You need 90 days online and the ability to get the past 12 months in a relatively short time.
    7. NULL When Done. As soon as you know you won’t need the card number, null it out or replace with some type of non-reversible artifact (xxxxxxxx-1234). PCI doesn’t require this. Common sense does.
    8. Never Use Live Cards In Development. Never. Even if you’re using encryption never land valid card numbers in any type of development/non-prod/non-PCI environment. That means no direct backup prod/restore to dev type operations, you have to change/erase those card numbers before it lands on disk in dev.
    9. Mind Your Tools. Profiler, third party monitoring tools, SSIS packages, they can all potentially capture and store card data as part of normal and seemingly routine activities. All the same rules apply to that data store too.
    27 Jul 04:32

    TechEd Europe 2013 Recap

    by MVP Award Program

    Editor’s note: The following post was written by Silverlight MVP Tony Champion

    WP_20130627_001

    Well it’s the day after TechEd Europe 2013 and I have just finished up a day of sight seeing in Madrid. Let me just say that it is one beautiful city. Being a Texas boy, we simply don’t have cities with buildings this old and ornate. It is always quite interesting to see the blending of the old and the new. Sometimes it has an amazing effect and sometimes, well, maybe not so much.WP_20130628_001

    The convention center, IFEMA, is a beautiful open air concept that allows you to get a little fresh air in between sessions. As with most TechEds, the facility was large enough it could easily take 10 to 15 minutes to get from place to place. However, the outdoor strolls made those enjoyable enough, even when it heated up during the day. I have to admit, I heard several comments about the heat in mid-day, but it was just another summer day for this Houston boy. At night it cooled off rather nicely, so every morning was spectacular. I do have to wonder about the 20+ ft tall gummy bears that stood guard at the entrance.

    I was forewarned about the late dinners you have in Madrid. It seems that 9 to 10pm is a great time for dinner. This seemed fairly strange and can mess with ya a bit if you are use to eating several hours earlier in the day. However, I soon found out the sun didn’t set until 10pm, so it was really just the end of the day here in Madrid and that made a lot more sense. Go figure. Food in a different country is always interesting. We found some really fantastic food and some that I can only call “interesting”. Again, being a Texas boy, my limited Spanish is all Mexican Spanish. It seems there are some differences between Spain and Mexico. Let’s just say I learned a lot about the word tortilla and its different implementations in different parts of the world.

    Spain has some amazing wines to enjoy and we might have tried one or two. The thing I found humorous is that almost every place you went into only had one type of beer. So you could easily simply ask for a “cervesa” and be ok. That’s a far cry from some of the large beer gardens we have in the States with over 100 types of beer on tap.

    Overall, I truly enjoyed my trip to Spain and appreciate all of the patience shown to us non-Spanish speaking geeks. But this is a recap on TechEd and not a travel post. So on to the geeky stuff.

    TechEd always has around a 75/25 split of IT and developers. It’s arguable that this year it was a little lighter on the developer side. However, between all of the great things going on with Azure, the Windows Phone platform, and the release of the Windows 8.1 preview at Build this week, there was a lot of excitement in the developer community. That’s not to say there wasn’t any on the IT side, but most of my interactions were with the developer community.

    There was nothing new released this week, and I’ll save the new Windows 8.1 stuff for future posts. The attendees were given a summary of some of the new 8.1 features from Joe Stegman, Group Program Manager for the Windows UI Platform team. In fact, Joe hung around Ask the Experts and gave some great insight on some of the decisions and challenges that the UI team faced when dealing with feature requests.

    I was asked at TechEd North America, and again this week, about what was the best way to get the most of your TechEd experience. And I think the answer to that is simple: “Take advantage of the conversations”. Now as a speaker, of course I want you to come to my session. In fact, if you want to stop by just to give me a glowing evaluation, that is ok too. :) But the real advantage of a conference is the people. How often to you get a chance to meet up with fellow developers from all over the world? When do you get a chance to talk with people from the team that creates your favorite product? How often can you open up your machine and trouble shoot whatever problem you are facing with some of the most experienced people in the industry? What ever you do, don’t just spend your time in the sessions and never ask the presenters questions, or go to the Expo, or events like Asks the Experts. If that is all you are looking to get out of a conference, then you can watch the sessions online the following week,

    And since I’m on my soap box on the subject, let me give you one little bit of insight on this. The presenters, product team members, and staff spend an enormous amount of time prepping for these things. As a presenter, my first priority is to deliver comprehensive sessions on my topics. However, the thing I enjoy most is getting to have conversations with other developers on those, and quite frankly any, topics. I like to see how people are using the tools and I always learn about new things and new challenges facing developers. The product team members like to here feedback on their products, even the stuff that isn’t the most flattering. They enjoy hearing about how people are using their products and what challenges they are facing. Believe it or not, but this feedback helps to guide the direction of their products.

    So get out there and jump into the conversation. It’s always a good time. It was a blast getting to be a part of TechEd Europe and I hope I get the opportunity to do it again. I have a few more conferences coming up this year, so I hope to see you all around and I look forward to speaking with each and every one of you.

    Now it’s time to enjoy my last evening in Madrid. Buenas noches….

    WP_20130629_020

    About the author

    photo

    Tony Champion is a software architect with over 16 years of experience developing with Microsoft technologies. As the president of Champion DS and its lead software architect, he remains active in the latest trends and technologies, creating custom solutions on Microsoft platforms. Tony is an active participant in the community as a Microsoft MVP, international speaker, published author, and blogger. He focuses on a wide range of technologies from web solutions such as HTML5, JavaScript, and Silverlight to client platforms for Windows Phone and Windows 8. He can be found on his blog and on Twitter.

    27 Jul 01:41

    CodeSOD: SQL Injection Not Found

    by Remy Porter

    The bug ticket complained, “When I try and update a certain page in the CMS, I get a ‘file not found’ error.” It included more details, explaining what page in specific was the culprit, but Michael was still confused. This application had been in use for over a decade, and no one had ever had a complaint like this. He also couldn’t replicate it, at least, not until he got the user to provide the specific text they were trying to use in the update.

    As soon as he noticed the sentence: “Stop; declare your intent!”, he realized it must have something to do with their SQL injection protection .

    
    'check for bad strings
    function checkforbad(str)
    	dim bad
    	bad = false
    	'get rid of spaces
    	str = replace(str,"%20"," ")
    	str = replace(str,"+"," ")
    	str = replace(str," ","")
    	
    	if instr(1, str,";DECLARE", 1) > 0 then bad= true
    	if instr(1, str,";SELECT", 1) > 0 then bad= true
    	if instr(1, str,";INSERT", 1) > 0 then bad= true
    	if instr(1, str,";UPDATE", 1) > 0 then bad= true
    	if instr(1, str,";DELETE", 1) > 0 then bad= true
    	if instr(1, str,";ALTER", 1) > 0 then bad= true
    	if instr(1, str,";DROP", 1) > 0 then bad= true
    	if instr(1, str,";CREATE", 1) > 0 then bad= true
    	if instr(1, str,";EXEC", 1) > 0 then bad= true
    	if instr(1, str,";TRUNCATE", 1) > 0 then bad= true
    	
    	checkforbad = bad
    end function

    But why on Earth was the error a 404 and not a, “Hey, don’t do this!” message, or at least a 500?

    function checkforsqlinjection()
    	dim stoppage
    	stoppage = false
    	if checkforbad(urldecode(request.querystring())) then stoppage = true
    	if checkforbad(urldecode(request.form())) then stoppage = true
    
    	'so if there issql injection type code going on give them a 404 and don't go any further
    	if stoppage then
    		Response.Status = 404
    		Response.End
    	end if
    
    end function
    [Advertisement] Make your team a DevOps team with BuildMaster. Pairing an easy-to-use web UI with a free base platform, BuildMaster gets you started in minutes. See how Allrecipes.com and others use BuildMaster to automate their software delivery.
    27 Jul 01:39

    Outlet to the Danger Zone

    by Dan Adams-Jacobson

    Chris Q had a reputation for being a bit of a maverick. He didn't make changes directly to production, dare the two-week-old butter chicken in the back of the lunchroom fridge, or even particularly like 1986's Tom Cruise / Val Kilmer vehicle Top Gun. But as part of an elite development team that had split from Government Department's mainframe dev group, Chris couldn't help being branded "other". When he walked by, the old-school mainframe developers whispered: there went a guy who thought dangerously out of the box.

    The original reason for the split was called The Internet. As the Thermodynamic Arrow of Time dragged them relentlessly towards the heat-death of the universe, Government Department needed a web presence and a modern intranet. While the budgetary committee would have been perfectly happy to somehow run the new system on the Department's existing mainframes, sanity prevailed (this time) and a new-ish PC server machine landed on Chris's doorstep. Since a majority of the senior-most developers wanted nothing to do with the newfangled equipment, the PC / Server Team was formed. While having perhaps not as much experience as the mainframe developers, Chris and co. knew their server needed an uninterruptible power supply and regular disaster-recovery testing. Every quarter, at a scheduled time, they would pull the plug on the UPS and ensure the server shut down gracefully before the batteries died. And the server always did.

    Everything was fine until that pesky Arrow of Time saw fit to move Government Department forward a couple years and into new offices. The "mainframe" support team had since added (or, some said, been forced to add) modern servers to their roster, running things like their new helpdesk system. Though the PC / Server Team and their senior counterparts remained separate in mind, body, and certainly in coding conventions, their hardware huddled close together in the new office's compact server room. All was well... for a few months. When quarterly testing time arrived, Chris pulled the plug on their UPS. As expected, their server logged the UPS warning messages and did a proper shutdown. Once the batteries were flat, Chris would record the maximum emergency window provided by the UPS, and he was just reaching for his logbook when he heard the screaming.

    The mainframe support team poured into the server room, looming over Chris. Their systems were crashing! What had he done?!

    It was a tense moment, but Chris - that maverick - answered a question with a question: had they plugged a server into his team's UPS? They had: three of them, in fact. After all, the three empty outlets on the power supply, blaze orange as they were, looked so inviting. And, Chris - thinking outside the box - followed with another question, had they installed UPS-monitoring software on those servers? They had not. So while Chris's server had realized it was on battery power and began its shutdown sequence, theirs had cheerfully dawdled until darkness fell.

    That was how Chris wound up giving Government Department's senior developers a riveting Powerpoint presentation on how to properly use a UPS with their servers. After the talk, one of the mainframe gurus stopped Chris and shook his hand.

    "You're still dangerous, maverick... but you can be my wingman any time."

    [Advertisement] Make your team a DevOps team with BuildMaster. Pairing an easy-to-use web UI with a free base platform, BuildMaster gets you started in minutes. See how Allrecipes.com and others use BuildMaster to automate their software delivery.
    27 Jul 01:35

    The Circle of Fail

    by Charles Robinson

    During Ulrich’s days as an undergraduate, he landed a part-time gig at a nuclear power plant. It was an anxious time to be on board at the nuke plant- the late 1990s. The dreaded Y2K loomed over all of their aging systems. One decimal point in the wrong spot at midnight on January 1st, 2000 and… well, nothing good would come of it.

    Ulrich’s job for the big conversion was more benign though. He needed to update the simple graphics on the monitoring program the nuclear technicians used to keep tabs on the reactor. The very basic macro language generated Commodore 64-quality graphics; it displayed the position of the control rods, neutron flux, water temperatures & pressure, turbine and generator stats, and how many three-eyed fish were caught in the neighboring lake. All of this was then shown on 10 massive CRT monitors mounted around the main control room.

    Ulrich worked diligently to get his screens prepared, and the day came for him to roll out the changes. They didn’t have a “test control room”, so the demo needed to be run live. He invited the engineers to gather ’round the monitors to see his spectacular new designs. When the program booted and Ulrich went to pull up the control rod screen, all 10 monitors went as black as the cloak on a member of the Night’s Watch. As the engineers chuckled, Ulrich turned bright red and ran back to the server room to see what happened. It didn’t take him long to realize that whatever he screwed up caused the entire mainframe to go down.

    Thus began a two-week battle to troubleshoot the mainframe issue, during which time the computer monitoring was completely unavailable. This caused the nuclear technicians to have to leave their air conditioned control room so they could use primitive analog monitoring tools from the 1970’s to check on the reactor. Every time Ulrich walked past one of them, he could sense them glaring and thinking “There’s that little pipsqueak that killed the monitors!”

    The tools Ulrich had to debug the program weren’t merely useless to him. They went beyond uselessness into outright opposition. The custom macro-language had no debugger or real documentation. The mainframe was purchased from the Czech Republic and one would have to know Czech in order to read the error logs. He was able to locate a sticker on top of the server with the phone number of the vendor. He was able to reach one of their ‘experts’ named Miklos, who asked him for the serial number of the product. Ulrich provided it but the expert retorted “That is not full number! This is too short. What you need help with? Toaster? Coffee maker?”

    Confused, Ulrich replied, “Ummm, a mainframe?” Had the nuclear plant bought their server from some sort of Czech Coffee, Toaster, and Mainframe Corp.? Miklos said “Oh no, Miklos can not help you. I give you number for Blazej. He does help with mainframe.” Blazej was an engineer at another nuclear power plant in the Czech Republic, who also had the same mainframe. Ulrich called there, not expecting much.

    Through a series of conversations with Blazej, Ulrich was able to finally narrow down the problem to the presence of circles in the screen outputs. Apparently drawing fancy circles was far too much for the monitoring program to handle. He removed all the circles from his screens, uploaded the changes to the mainframe and finally the engineers could see the reactor statistics on the bright, beautiful monitors; without any circles. The result was ugly, boxy, and barely readable, but it worked. Ulrich breathed a sigh of relief then decided to call Czech Coffee, Toaster, and Mainframe Corp. back to notify them of the horrible bug in their program.

    Ulrich once again got connected to his buddy Miklos. “Hi Miklos, this is Ulrich. I called a while back concerning our power plant monitoring program crashing the mainframe. You’ll be glad to know that Blazej and I were able to determine the problem. It all had to do with circles being drawn on the screen. I know it sounds silly, but that causes the whole mainframe to come down.”

    Miklos seemed to be offended by such an accusation. “You do a circle and server come down? You want Miklos to fix this? You stupid? If you know circle cause trouble, then DO NOT USE CIRCLE!” Miklos abruptly hung up. Ulrich shrugged it off since his job was done. He eventually finished his undergrad program before Y2K and moved on from the nuclear power plant. When New Years 2000 rolled around, he made sure he was far, far away at a ski resort just in case anyone else slipped a circle into the graphics and the plant melted down as a result.

    Nuclear Reactor image from the public domain
    [Advertisement] Make your team a DevOps team with BuildMaster. Pairing an easy-to-use web UI with a free base platform, BuildMaster gets you started in minutes. See how Allrecipes.com and others use BuildMaster to automate their software delivery.
    16 Jul 16:55

    CodeSOD: PHP Doesn't Have Date Functions Either

    by Dan Adams-Jacobson

    We recently brought you the touching story of Shaun, and his coworker's mistaken belief that Perl has no built-in methods for working with dates. Well, Shaun can rest easy: he's not alone.

    Apparently, PHP can't handle dates either.

    Whether you're Shaun or Chris M, working for an email-marketing company whose clients needed some tweaks to their software integration, inheriting a codebase is always the same: like inheriting a teetering cliffside mansion from your eccentric grand-uncle, the hair on the nape of your neck seems to tickle as you reach for the doorknob. Something primordial reaching up into your consciousness, warning you. But you always go in, despite the horror doubtlessly lurking inside... Chris had just finished recovering from exposure to a custom, PHP-delimited file importer when he received his new inheritance from a fellow consultant. Among the first things Chris learned from inspecing the code was that, if you abuse a database long enough, you can force it to reveal the next 31 days.

    First, insert dates into a database table:

    INSERT IGNORE INTO Dates (Dates) VALUES
    (DATE_ADD(CURDATE(), INTERVAL+1 DAY));
    ;
    
    INSERT IGNORE INTO Dates (Dates) VALUES
    (DATE_ADD(CURDATE(), INTERVAL+2 DAY));
    ;
    
    INSERT IGNORE INTO Dates (Dates) VALUES
    (DATE_ADD(CURDATE(), INTERVAL+3 DAY));
    ;
    
    ...<snip>...
    
    INSERT IGNORE INTO Dates (Dates) VALUES
    (DATE_ADD(CURDATE(), INTERVAL+32 DAY));
    ;
    

    Then, select the next 31 days and their difference from the current date, so we can iterate over them back in PHP Land - a benighted, backwards place where not a single tool to manipulate a date has ever been even theorized.

    $sql=<<<SQL
    SELECT Dates, DATEDIFF(Dates, CURDATE()) AS `datediff` 
    FROM Dates 
    WHERE DATEDIFF(Dates, CURDATE()) < 32 AND DATEDIFF(Dates, CURDATE()) > -1 ; 
    SQL; 
    
    $result = $db->query($sql);
    
    while ($row = $result->fetch()) {
        ...<snip>...
    }
    

    His mind left fragile by its tango with the previous PHP perversion, Chris pulled a colleague in to make sure the above wasn't some sort of hallucination. The other developer offered that perhaps the code's author had understood SQL better than PHP. While Chris appreciated the effort, the thirty-two INSERTs into the same field in the same table made that justification seem unlikely. But it did make him feel better enough to replace the lot with this, leaving the poor database to nurse its wounds in peace:

    for ($i = 0; $i <= 31; $i++) {
        $date = date('Y-m-d', strtotime("+$i days"));
        ...<snip>...
    }
    

    Remember, developers: before you reach for the Swiss Army knife of your nearest database, do poor Chris and Shaun a favour and consider that your programming language of choice just might be able to do things with dates all by itself.

    [Advertisement] Make your team a DevOps team with BuildMaster. Pairing an easy-to-use web UI with a free base platform, BuildMaster gets you started in minutes. See how Allrecipes.com and others use BuildMaster to automate their software delivery.
    16 Jul 16:51

    July, the 31 Days of SQL Server DMO’s – Day 16 (sys.dm_sql_referenced_entities)

    by Tamarick Hill

    The sys.dm_sql_referenced_entities Dynamic Management Function returns a result set of all objects that are being referenced as part of the definition of the object that you reference within the function argument section. You can pass in either a Server trigger, database trigger, or any object  such as a table or view. To better illustrate how this function works lets look at a stored procedure and a view in the AdventureWorks2012 database.

    If we script out the uspGetEmployeeManagers stored procedure we will see the code in the screenshot below.

    image

    If we needed to know all of the tables, columns, views, functions, etc. that are referenced in this stored procedure, we could easily just read through the code since its not too long.  But what if the code was 1000 lines worth of complex code? It would be difficult to gather a list of all dependent objects without running the risk of missing something. This is where the sys.dm_sql_referenced_entities function comes in handy. If we run a quick query against this function, we can immediately see the benefit:

    SELECT referenced_schema_name, referenced_entity_name, referenced_minor_name,
           referenced_class_desc, is_ambiguous, is_selected, is_updated, is_select_all
    FROM sys.dm_sql_referenced_entities ('dbo.uspGetEmployeeManagers', 'OBJECT')

    image

    As we see from the result set, this function returned all of the objects that are referenced in the stored procedure. From the first 3 columns that I am selecting from this function, we can see the Schema name, Object name, and column name fields. The next column is the referenced class description which can either be an OBJECT OR COLUMN, TYPE, XML SCHEMA COLLECTION, or a PARTITION FUNCTION. The next column, is_ambiguous, describes whether or not the referenced object is ambiguous and can resolve to multiple different objects during run-time. The is_updated column describes whether or not this object is updated as part of the stored procedure. Lastly, the is_select_all column refers to whether or not a particular column is included as part of a select * statement.

    This DMV is very helpful when you need to identify all objects that are being referenced as part of another object.

    For more information about this DMV, see the below Books Online link:

    http://msdn.microsoft.com/en-us/library/bb677185.aspx

    Follow me on Twitter @PrimeTimeDBA

    16 Jul 16:49

    The benefit of placement firms

    by James Serra

    Why do companies use placement firms to find IT contractors instead of using internal recruiters?

    Who better to answer that question that a recruiter who has been in the business for many years and now has his own small placement firm (and explains why using a small placement firm has many benefits over the big guys):

    Internal recruiters cost money and are a fixed cost regardless of how they produce.  Staffing firms only get paid when they deliver – basically risk free, especially to those small to mid-size companies that don’t have staffing needs on a regular basis.  Also, staffing firms are much more aggressive for the reason I mentioned – we don’t get paid unless we deliver.  Internal recruiters (generally) aren’t as motivated.  I’ve had a number for clients drop their internal recruiter(s) after seeing how hard I work (sitting in on interviews, prepping candidates, generally saving people time).

    Also the cash flow issue – a staffing firm typically floats the first 6-8 weeks of a candidate’s pay until the firm’s first invoice is due.  This is big for smaller tech companies working on projects.

    They pre-qualify the candidates, saving time for the client; Not having to do the dirty work of getting rid of a person who is not working out (calling the placement company and having them do it); Less chance of getting in trouble with the IRS by having the contractors being considered employees.

    Good staffing firms specialize in certain skill sets and bring decades of experience with solid pipelines.  I love nothing more than when my client tries to staff a position first.  After a week of reading a ton of garbage resumes they give up and tell me to figure it out.  I’ve even had clients come to networking events with me and want to leave after 30 minutes.  It isn’t rocket science but it is hard work.  Wading through the garbage to get to the diamonds in the rough, it takes a lot of time.

    My basic selling points (as the “little guy”) –

    (1) My staffing firm is risk free.  We don’t cost anything unless we deliver

    (2) We are small, so clients have one throat to choke.  We are not a billion dollar giant with tons of bureaucracy (and overhead cost for that matter)

    (3) We are flexible, responsive.  You never work harder than when you work for yourself.  Personally, my name is in the company name for a reason – accountability

    (4) We are easy to work with.  I’m very proud of the fact that not one of my clients have changed a word of my MSA.  It means my contracts are fair/clear/reasonable

    (5) We bring 20 years of IT staffing experience – You have seen the difference between veteran recruiters and rookies: One saves you time, the other wastes it

    (6) I’m frank/honest with what we can deliver and what is over our head (a reference to volume, not skill set – we don’t pretend to be able to deliver 50 Java developers)

    (7) I generally close by asking for a “chance” – no strings attached.  Talk only goes so far, your actions have to speak louder than words

    The first few years of my company’s existence, I was in full-on sales mode.  Now that I have over a dozen clients, I’m more focused on keeping them happy than I am trying to land new clients.  But don’t get me wrong, I’ll talk to anyone on any day about my company.  I also get more referrals now, just having been around for 4 years – I think people realize that you’re going to stick and not be a flash in the pan.  With more positive interactions come more opportunities to be referred.  I literally got two calls last week from people asking if they could introduce me to a new client.  I couldn’t be more proud of that.

    More info:

    A brief note on recruiters and recruitment

    16 Jul 10:02

    Making a better, somewhat prettier, but definitely more functional Windows Command Line

    by Scott Hanselman
    Running htop via ssh under ConEmu
     

    I've blogged before, in fact in 2004, (!) that Windows is missing the text mode boat. There is a massive opportunity for a great, nay, awesome and pretty, command line on Windows. If someone cracks this problem, they're gonna be heroes.

    I love iTerm2 and its tabs, its font handling, its simple elegance. I want this on Windows. In 2011 I found Console2, and then in 2012 I moved to ConEmu, a great tabbed terminal for Windows. Even then, it's not "pretty." I love these guys, and the ConEmu is truly an amazing and configurable piece of software, but it was written by developers for developers. I have to change the fonts to Consolas for the main font and Segoe UI for the rest to make it tolerable. Am I being petty and focusing on looks? Absolutely. Gorgeous and functional software is why Mac companies like Panic exist. They make things that are pretty AND functional. Windows folks could definitely "lovingly design" stuff more.

    Here's some command line utilities that augment and help - but don't yet complete save - the Windows Command Line.

    Clink

    I just learned about Clink and I'm hooked. It's hooked as well, directly into your cmd.exe window! *rimshot*

    We all know that there's Cygwin for a bash-like experience in Windows, but Click is a small utility that brings some of those productivity and editing features into cmd.exe directly!

    • Bash-like line editing from GNU's Readline library. Read more on Readline's keyboard shortcuts.
    • Better path completion (TAB).
    • Paste from clipboard (Ctrl-V). Oh yes.
    • Support for the completion of executables/commands, and environment variables.
    • Undo/Redo (Ctrl-_ or Ctrl-X, Ctrl-U)
    • Improved command line history.
      • Persists across sessions.
      • Searchable (Ctrl-R and Ctrl-S).
      • History expansion (e.g. !!, !, and !$).

    The most significant change that Clink makes is to Tab Completion, moving to a more Bash-y "show them the choices" mode rather than the DOS-like "make them cycle through everything." Here I've pressed TAB over 2013-0 and Clink is showing me what I can choose from.

    using Clink to make cmd.exe better

    PowerShell ISE

    Surprise! You already have this on your Windows computer. Ya, it freaked me out also. You can even hide the script pane if you want (Ctrl-R) and just use PowerShell ISE as a console! You get auto completion (see the Directory intellisense below), coloring, aliases and all the power of PowerShell.

    Sure, it's not bash, but that may be a good thing. You may not have been exposed to PowerShell and the prospect may frighten you, but try it for a bit. They've aliased the obvious commands "ls" does what you'd expect as does "dir." Moving around will feel like any command prompt.

    Not to mention if you are using PowerShell you already get a full debugger experience.

    The PowerShell ISE

    It won't win any awards for good looks (again, I come back to the importance of fonts, whitespace, and good typography...get a designer) but it is extremely functional and you already have it!

    ConEmu lets you put your consoles in JumpLists!ConEmu

    I've talked about ConEmu before, but I'll bring some of that over here. ConEmu takes your command prompt and adds tabs, status bar details, admin tabs, freakin' taskbar progress bars on copies (which is hot), and deep support for FarManager (Norton Commander anyone?)

    Tabs in ConEmu in Windows

    ConEmu is definitely a huge jump for console usability on Windows. The feature that really blew me away was Progress Bar integration. If you're familiar with Windows 7 you are likely familiar with the way that progress bars are overlaid over a Windows 7 Taskbar button. ConEmu looks at the current application running and some heuristics and overlays progress. Madness. Do a chkdsk and watch the progress bar. Love this little detail.

    ConEmu Progress Bar

    Git for Windows or Cygwin

    If you want a Linux-like experience on Windows with a nice shell, Cygwin has long been a choice. However, since the release of Git for Windows most folks I know just install it and use the Git Bash. If you get Cygwin proper you'll get a much more complete "fake Linux" through their very competent set of command line tools, but for most, Git Bash will suffice.

    Git Bash gets you close

    What about SSHing? That's a fundamental part of command-line life for folks connecting to remote Unix machines. For me, I have a Linux farm I run on Azure that I often need to ssh into.

    Random: I like to say I 'shoosh' into the machines, but folks keep looking at me weird. I thought this was a thing?

    However, the SSH clients for Windows suck. Ok, they don't suck, but they are ugly. It's scandalous how ugly. Mad respect to PuTTY for being awesome and super functional, but it's like running Windows 95 in a window every time I launch it. Here's some better SSH clients, including a fork of PuTTY itself.

    Bitvise SSH Client - more importantly, SSH from the command line

    The Bitvise SSH Client is free for personal use and works great. There's a whole GUI, and, bless them, it's not pretty. However! There's also a command line version which is the REAL treasure. I just want to type ssh and be on my way.

    In fact, I made a batch file called "ssh.bat" and put it in my PATH that just has this inside: "stermc %1" this means I can just type ssh user@hostname:port and be on my way. This is, for me, WAY easier than putty for most things. Bitcise is definitely worth checking out.

    Find a SSH command line too like Bitvise

    Kitty

    Kitty is a fork of version 0.62 of the original PuTTY. There's also a portable version that I've put in my Dropbox utils folder (which is in my PATH) so it's on every machine I have automatically. Kitty has some nice features like Send to Tray, transparency, session launching (so you don't need Pageant), and lots of little poweruser features like "rolling up" the app if you Ctrl-Click on the Title Bar.

    Kitty is a better PuTTY alternative

    Kitty also can integrate into your browser to handle ssh:// links, which is a nice touch.

    What console app improvers have I missed? What do you use on Windows? Sound off in the comments.


    Sponsor: Big thanks to the folks at RedGate for sponsoring the feed this week. Take a moment and check out their free download of Deployment Manager! Easy release management: Deploy your .NET apps, services and SQL Server databases in a single, repeatable process with Red Gate’s Deployment Manager. There’s a free Starter edition, so get started now!



    © 2013 Scott Hanselman. All rights reserved.
         
    16 Jul 00:30

    Why You Shouldn’t Disable The Task Scheduler Service in Windows 7 and Windows 8

    by Mark Morowczynski [MSFT]

    Hello, Jeff “The Dude” Stokes here for an installment on a very important topic.  Why should I not disable the task scheduler in Windows?

    Long, long ago in the annals of IT history, the Task Scheduler was a poorly understood component of Windows.  “What does it do?” We’d wonder…

    Fast forward to today and now, the Task Scheduler is still a poorly understood component of Windows.  “What does it do and why can’t I disable it to be secure?” We ask…

    We have heard about some changes in Vista and Windows 7 regarding the task scheduler, but really, why not disable the dang thing to be more secure or increase system performance?

    Because disabling the task scheduler does not make your system more secure, nor does it increase system performance.  In fact, it makes your system less secure in Windows 8, and in Windows 7 and 8 makes performance worse, especially over time.

    In Windows 7 the Task Scheduler is responsible for background health and cleaning processes such as optimizing prefetch and readyboot for instance.  It also handles light defragmentation runs on the system.

    In Windows 8, it’s even more important. It optimizes the start menu…

    Pic1

     

     What else?  File History is task scheduler based.

    image

     

    Bluetooth device cleanup (when you unpair a device)

    image

     

    Cleaning up Application Temporary Files as well

    image

     

    How about making sure the file system is healthy?  Yeah that’s a task, too.

    image

     

    Run RAID sets on your machine?  You’ll want task scheduler.

    image

     

    How about Windows Updates?

    image

     

    So let’s leave the Task Scheduler Service alone in our quest for security hardening and go pick on more interesting things like Anti-Virus and Data Loss Prevention kits. 

    So remember, Relax, don’t do it. Don’t disable the task scheduler!

     For more information on the Task Scheduler see below:

    Task Scheduler Changes in Windows Vista and Windows Server 2008 – Part One
    http://blogs.technet.com/b/askperf/archive/2008/06/24/task-scheduler-changes-in-windows-vista-and-windows-server-2008-part-one.aspx

    Task Scheduler Changes in Windows Vista and Windows Server 2008 – Part Two
    http://blogs.technet.com/b/askperf/archive/2008/10/10/task-scheduler-changes-in-windows-vista-and-windows-server-2008-part-two.aspx

    Task Scheduler Changes in Windows Vista and Windows Server 2008 – Part Three
    http://blogs.technet.com/b/askperf/archive/2009/03/17/task-scheduler-changes-in-windows-vista-and-windows-server-2008-part-three.aspx

    Two Minute Drill - Quickly test Task Scheduler
    http://blogs.technet.com/b/askperf/archive/2011/06/10/two-minute-drill-quickly-test-task-scheduler.aspx

    What’s New in Task Scheduler for Windows 8 & Server 2012
    http://blogs.technet.com/b/askperf/archive/2013/07/05/what-s-new-in-task-scheduler-for-windows-8-amp-server-2012.aspx

     

    Jeff “The Dude” Stokes

    15 Jul 19:31

    My Advice About Growing as an IT and Data Professional

    by kevin

    IMG_6089First off, I should apologize for not blogging in quite a while. I’ve been traveling extensively and, because traveling can be so exhausting, I usually have to queue up blog posts to hit while I’m traveling in order to remain active while on the road.  I didn’t do that for these most recent trips.  And, as you can see, there’s quite a gap between this post and my last few posts.

    On the other hand, I’ve experienced an unexpected surge in requests for mentoring and advice from friends and colleagues about career growth.  I’m always happy to help out a friend and, in fact, there’s little in my professional experience which I enjoy more than seeing friends grow and advance.  Because I get a lot of questions about professional development, I try to distill these thoughts down into usable nuggets of wisdom. Since I’ve had quite a few interview requests in the vein of career and professional development in the last several weeks, I thought what better way to get back on track with blogging than to put them all together in one place!

    Infusive Solutions: Mindful Career Development

    Ben Weiss, the digital marketing strategist at Infusive Solutions in NYC (facebook | blog | twitter),  reached out to me a few weeks ago to discuss career development for his own team members around personal branding and career growth.  This guy is really good at making hay while the sun is shining!  We’d initially done just a simple phone discussion, but he has managed to expand these discussions in a variety of ways.  Here are all the links, at present, to the content that Ben created from our discussions:

    1. How to Become an IT Executive: http://www.infusivesolutions.com/blog/bid/99470/How-to-Become-an-IT-Executive-Especially-if-You-re-a-DBA

    2. Landing page with the full interview: http://www.infusivesolutions.com/a-sql-server-mvp-with-knowledge-to-share

    3. The recent NYC SQL User Group meeting  on personal branding: https://clicktoattend.microsoft.com/en-us/Pages/EventDetails.aspx?EventId=170863

    One of the key points in the interviews is that the IT professionals who learn the most about what their business does, rather than just the IT that they work with, are frequently the most successful.  It’s a mindful process of understanding your strengths and weaknesses, improving those areas where you are genuinely weak, amplifying those areas where you excel, develop your communication skills, and then getting outside of your comfort zone to become a genuine business problem solver.  I’ve said it 100 times - There are NO information technology problems. There are business problems which are solved with information technology.  The IT pros who learn that lesson are mighty indeed.  Grab the interviews for more insight.

    Louis Davidson: Why We Write

    Louis “DrSQL” Davidson (blog | twitter) is a long-time good friend. We live in the same fantastic town of Nashville, TN and both are active supporting the local SQL Server user group.  In one of those ironic twists of 21st-century life, we might see each other more in a given year in other cities than in our own home town, simply because we both speak and travel a lot.  If you’re considering growing your professional credentials by writing, and it certainly is one of the best ways to grow your personal brand, then I recommend you read all of the “Why We Write” posts in Louis’ blog series.  In my interview, I tried to give Louis really thoughtful and insightful answers.  The core of my advice to potential writers in the answer to question number 5.  Please read it and tell me what you think.  The full blog post and interview:

    Why We Write, #6 by Louis Davidson with Kevin Kline: http://sqlblog.com/blogs/louis_davidson/archive/2013/06/05/why-we-write-6-an-interview-with-kevin-kline.aspx

    Richard Douglas: Speaking and Presentation Skills

    My friend and former coworker at Quest Software, Richard Douglas, has also put together a very nice interview series which is predominantly IT experts from the UK.  Richard has some very insightful questions and a rather different strategy.  His questions are much more focused on developing skills as a good speaker.  It’s been very gratifying to see Richard grow in stature as a database expert over in the UK and I’m looking forward to even more great community work from him.  An example of the interesting sort of questions he came up with for the interview include how to make a presentation not only useful, but how can the speaker project it with authority and charisma?  That’s not the sort of question I get every day.

    Read my answer to this question and many more here: http://sql.richarddouglas.co.uk/archive/2013/01/new-years-aspirations-kevin-kline-presenting.html#ixzz2MtjkVRb6.

    Tim Ford: Interpersonal and Communication Skills

    Tim Ford (blog |twitter) is a long-time SQL Server pro, who’s also a very energetic volunteer and user group leader.  One of his very cool side projects is the SQL Cruise.  Tim has been running the SQL Cruise for many years now and it’s one of my favorite ways to conduct training.  We have a full day of training every day that we’re at sea.  We do excursions together.  We do “office hours” every evening with lots of time to discuss individual problems, career development, and focus on individual mentoring.  A while back, Tim conducted an audio interview and, for some reason, I’m only now getting around to post it.

    Please give it a listen and let me know what you think: http://db.tt/UHJK5ojc. (This is a Dropbox file. I’m not really sure how it’ll behave if you do not have Dropbox installed. Let me know if it doesn’t work as expected).

    Rodney Landrum and SQLBeats: Looking Over the Horizon

    Rodney Landrum (Twitter |Blog) has been putting out great content for Simple-Talk for quite a long time now.  In fall of 2012, we got together and recorded a podcast which was really fun and, at times, funny. Here is the full podcast. Rodney told me that he laughed at several points while editing.  That makes me happy!

     

    Most of all, I’d like to hear your feedback.  Let me know what you think by posting a comment here, mentioning this on Twitter, or social media like Facebook or LinkedIn.  Many thanks,

    -Kevin

    -Follow me on Twitter!
    -Google Author

     

    15 Jul 15:27

    July, the 31 Days of SQL Server DMO’s – Day 15 (sys.dm_os_volume_stats)

    by Tamarick Hill

     

    The sys.dm_os_volume_stats Dynamic Management Function is used to return information about disk volumes that contain your database files. This is a function so you have to pass in a database_id and a file_id. To better illustrate this function, I will run the below query on my TRAINING instance.

    SELECT db_name(vs.database_id) AS DatabaseName, vs.file_id, vs.volume_mount_point, vs.volume_id, vs.logical_volume_name,
           vs.file_system_type, (vs.total_bytes/1024/1024/1024) AS [TotalSize(GB)], (vs.available_bytes/1024/1024/1024) AS [AvailableSize(GB)],
           vs.supports_compression, vs.supports_alternate_streams, vs.supports_sparse_files, vs.is_read_only, vs.is_compressed
    FROM sys.master_files mf
    CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) vs
    WHERE db_name(vs.database_id) like 'Adventure%'

     

    image

    image

    In the query above, I used the information from sys.master_files and then used the CROSS APPLY with my sys.dm_os_volume_stats function to return the shown result sets. In addition to this, I filtered the results so we only se information for the ‘AdventureWorks2012’ and ‘AdventureWorksDW2012’ databases.

    The first column returned by this function displays the Database Name. The second column displays the file id which in my case is the data file and transaction log file. The volume_mount_point column represents the the mount point that the volume is rooted to. Next we have the id for the specific volume and a name for the volume. The file_system_type column signifies whether you are using NTFS, FAT, RAW, etc system file types. The next columns represent the size of my volume and how much of that volume is still available for use. The next three columns of this function are self explanatory for the most part, signifying whether or not compression, alternate streams, or sparse files are supported. Lastly, we have have columns displaying whether or not the volume is read only or if the volume is compressed.

    Many DBA’s tend to rely on the extended stored procedure xp_fixeddrives when needing to see free size information about the various drives on a particular server. The problem with disk drive letters is that you are limited by the number of letters in the alphabet. If your database servers use mount points instead of drive letters, then xp_fixeddrives becomes useless to you at that point. This is when I find a Dynamic Management Function like this one to be extremely useful. There are other ways to get mount point information such as using powershell and WMI, but for a SQL Server solution, this DMF provides one of the best solutions available.

    For more information about this Dynamic Management Function, see the below Books Online link:

    http://msdn.microsoft.com/en-us/library/hh223223.aspx

    Follow me on Twitter @PrimeTimeDBA

    15 Jul 13:01

    Transaction dependencies and speculative reads with memory-optimized tables

    In reading the whitepaper about “High-Performance Concurrency Control Mechanisms for Main-Memory Databases”, I was intrigued by the discussion of speculative reads and transaction dependencies. It’s not always good to use information from an academic whitepaper as details of an implementation, because the real implementation might be differ slightly from the description of the whitepaper’s implementation. In addition, bear in the that the observations here are based on the CTP1 version of SQL Server 2014, and the details may change by RTM.

    The whitepaper describes two implementations of the multiversion storage engine, one using optimistic concurrency with no locking and one using locking. Because the description of the Hekaton feature mentioned “lock-free structures” as one of the pillars, I looked at the optimistic implementation as possibly close to the CTP1 implementation.

    To paraphrase the whitepaper, there are at least two ways the speculative reads can take place.
    1. Transaction A determines if it can read Version V. Version V’s begin timestamp is another transaction’s (TB) ID. TB is in the preparing state. Use transaction B’s end timestamp (obtained right before TB-prepare) as V’s begin time. Speculatively read V. Transaction A has a commit dependency of Transaction B.
    2. Transaction A determines if it can read Version V.  Version V’s end timestamp is another transaction’s (TE) ID. TE is in the preparing state. Use transaction E’s end timestamp (obtained right before TE-prepare) as V’s end time. Speculatively ignore V. Transaction A has a commit dependency of Transaction E.

    These descriptions mention dependencies happening in the transaction A’s Active phase, but in addition, there is mention that “Transaction A may acquire additional commit dependencies during validation but only if it speculatively ignores a version.” For information about transaction phases, reference the previous blog entry or the whitepaper.

    After noticing the extended events dependency_acquiredtx_event and waiting_for_dependenciestx_event, I set out to look for those dependencies. Because tx A can only acquire a dependency on tx B when B is in the preparing phase (state) and, in most cases, the preparing state is usually pretty short, the dependency sounded almost like a race condition. Making the preparing state as long as possible would give me a better chance.

    In the implementation of pessimistic concurrency, the whitepaper mentions two “read sets” that are checked during the prepare phase. The ReadSet contains pointers to every version read and the ScanSet stores information needed to repeat every scan. The whitepaper also describes WriteSet but that’s outside my scope. During the prepare phase ReadSets are checked to ensure consistency in isolation level Repeatable Read or Serializable. In addition, ScanSets are checked to guard against phantoms in isolation level serializable. Serializable isolation with a large ScanSet seemed to be the best choice to lengthen the prepare phase.

    I declared a two column memory-optimized table with a primary key/hash index on one column (I called it id) and no index at all on the other column (c1). Added 250000 rows. And figured that running the following batch inside a “traditional transaction” (to slow things down even more compared to a compiled stored procedure) from multiple clients should produce the behavior.

    begin tran
    select top(150000) * from txtest with (serializable);
    update txtest with (serializable)  set c1 = 1
    where c1 in (select top(10) c1 from txtest with (serializable) order by newid()); — update 10 random rows
    commit

    Running betwen 5-10 of these clients simultaneously for 20 iterations of each client produced the “transaction dependency” behavior consistently. Tracing this with an extended events session that included:
    dependency_acquiredtx_event
    waiting_for_dependenciestx_event
    before_changestatetx_event
    after_changestatetx_event

    Besides observing the behavior, I was able to make some interesting observations from the event session.
    1. You can have “chains” of dependent transactions, e.g. tx A depends on tx B, tx B depends on tx C, etc.
    2. You can have multiple dependent transactions on the same transaction, e.g. txs A, B, and C all depend on tx D.
    3. You can have multiple dependency_acquiredtx_event for the same two transactions, e.g. two different occurrences of the event for the dependency tx A depends on tx B.

    I also noticed one other interesting behavior. In my tests with 10 clients x 20 iterations (200 transactions total) between 60% and 80% of the transactions routinely abort. That’s not surprising, the test was set up to produce read conflicts. What was surprising is that, although transaction commit wasn’t the norm, every transaction that had dependency(ies) on it ended up committing. 22 of 200 transactions in one test. And the dependent ones also committed. That’s surprising, but perhaps the fact that transactions that have dependencies and those that they depend on all eventually commit is just an artifact of the test batch I chose.

    So, from this test you can deduce that:
    1. Transaction dependencies and speculative reads are real (can be observed) in the current implementation.
    2. Large scans and transaction isolation levels higher than snapshot should only be used with memory-optimized tables when *absolutely* necessary. 3. That’s especially true for transaction isolation level serializable.

    Cheers, Bob

    The post Transaction dependencies and speculative reads with memory-optimized tables appeared first on Bob Beauchemin.

    15 Jul 12:59

    The Accounts were Pounded

    by Remy Porter

    Shortly before the global banking system chucked itself out of a 30 story window, one large bank purchased another. It may be a common financial transaction, but from an IT standpoint, the problem of integrating these two systems is always unique, always cumbersome, and fraught with problems. Sajid was one of the lucky individuals tasked with the great undertaking of moving these banks to one information system.

    Sajid’s business contact at the new acquisition was Karel. The latest round of requirements from Karel described a system to track the movement of trade data from his office, in the Netherlands, to Sajid’s, in London. The underlying business logic was simple, but the nature of financial data added layers of complexity. Moving data around was insufficient- there needed to be auditing, tracing, regulatory compliance. The project needed a risk management plan, the tool itself needed a clearly documented backup and recovery plan, and on top of all of that there was the simple management overhead of doing anything in a large organization.

    Sajid’s finished estimate for the project contained a large number of digits. Karel blanched at the expense, thanked Sajid for his time, and scurried back to Rotterdam. Sajid assumed that he’d receive a budget in a few days.

    Instead, everything was silent for months. The original deadline for the migration project slipped past, and a new project called “Tail Migration” began. When that deadline became a historical relic, the “Post-Migration” began, and mutated into the “Post-Tail-Abaft-Ongoing-Terminal Migration”. The ever-changing nature of integrating two systems nobody understood kept Sajid busy.

    Then £14M vanished. Someplace over the North Sea, the data ceased to exist. Given the number of integration packages which handled the records, and given how many of them Sajid was responsible for, the missing money became Sajid’s very large problem.

    Sajid started by searching the mainframes for the transaction data. He moved over to the MQ logs, the BizTalk audit trails, and the couch cushions. All he could find was a fourteen-million-pound-shaped-hole in their records. Suggesting that it might be a rounding error, or a missed decimal point didn’t make his management happy.

    Sajid called Karel. “Oh, yes,” Karel said. “We did have an issue with the Bankieren Integratie Tool. I had to delete the code and rebuild it. Perhaps some transactions were lost?”

    “The what? Wait- wasn’t that the estimate request I did? We never built that for you.”

    “Well, you wanted too much money. We only need the system until migration is complete. We found an outside contractor who worked more cheaply. I’ll send you a link to the network share on my computer.”

    Sajid’s mouse-hand twitched with terror and rage. He opened the network share and saw what he feared he would see- 8GB of data split across 6 Access databases. It was clearly “backed up”: a sub-folder on the same share called “Reservekopie ” contained older versions of all the files.

    Sajid poked at the files like they were made of pure nitroglycerine and might explode if he sneeze. Despite being a mass of VBA, clumsy SQL queries, and being mostly written in Dutch, Sajid still recognized some of the functionality that he had estimated months prior. Like an old hard candy, forgotten under the couch, it had gathered layers of dust, dirt, and garbage. New features and reports had been added. None of it was audited. None of it was tracked.

    There were nearly a billion pounds worth of transactions sitting in that database. Sajid had no idea if any lived in some other system, or if this were their only record.

    Sajid copied the application and picked apart the database like a coroner examining a murder victim. Most of the data resided in split files- Access capped at 2GB of data per file. All of the code resided in one file called “DeCode.accdb”. In addition to holding code, it held scratch tables for some of its work.

    It was impossible to be certain, but Sajid’s best guess was that some bug in the underlying VBA code threw errors, and convinced Karel that the database was “corrupt”. Karel deleted “DeCode” and replaced it with his “backup”. The £14M Sajid needed to find caused the error, and had been deleted along with the scratch table.

    The data was gone, and with it, the money. There was no IT solution here, and Sajid needed to turn this over to accounting. They needed to perform an audit to rebuild the transaction data. Banking regulators would need to get involved. Honestly afraid of the coming fallout, Sajid went to his boss to deliver the bad news.

    Little did he know that his boss had bigger problems- the person responsible for running their close-of-business batch jobs had accidentially deleted the entire process queue. Their bank was effectively closed until the very finicky mainframe scheduling system could be reconfigured. Suddenly, losing £14M wasn’t much of a problem.

    Image via Wikimedia Commons. Source.

    [Advertisement] Make your team a DevOps team with BuildMaster. Pairing an easy-to-use web UI with a free base platform, BuildMaster gets you started in minutes. See how Allrecipes.com and others use BuildMaster to automate their software delivery.