15 Aug 10:56

Personally Identifiable Information (PII) and Data Encryption

Hitting close to home, SC Governor Nikki Haley noted that after the SC Department of Revenue breach was reported, that the IRS didn't require the data to be encrypted:

“As I am sure you are aware, an international hacker recently breached the South Carolina Department of Revenue’s computer system exposing the personal information of all electronic tax filers in my state,” she wrote to IRS Acting Commissioner Steven T. Miller. “While this incident was entirely caused by a malicious criminal hacker, the investigation of how this breach occurred has unfortunately revealed that the IRS does not require encryption of stored tax data, only transmitted data.”

The IRS tried to refute this, saying they "had a long list of requirements." It turned out she was correct, and you can see the evidence in IRS Publication 1075, which states (for data at rest):

While encryption of data at rest is an effective defense-in-depth technique, encryption is not currently required for FTI while it resides on a system (e.g., in files or in a database) that is dedicated to receiving, processing, storing or transmitting FTI, is configured in accordance with the IRS Safeguards Computer Security Evaluation Matrix (SCSEM) recommendations and is physically secure restricted area behind two locked barriers. This type of encryption is being evaluated by the IRS as a potential policy update in the next revision of the Publication 1075.

Then we see that the Department of Homeland Security had a data breach, and once again, PII data was taken. That PII data was unencrypted.

As a result of this vulnerability, information including name, Social Security numbers (SSN) and date of birth (DOB), stored in the vendor’s database of background investigations was potentially accessible by an unauthorized user.

This begs the question, "When are we going to get serious about encrypting personal information?" I know there are challenges to doing so, especially for older systems. Also, when it comes to SQL Server, not everyone can afford Enterprise Edition licenses where you get Transparent Data Encryption starting in SQL Server 2008. I also know there's concern because encrypted files don't compress well, so when you take a backup and it's encrypted and then try to pass it over to a system that dedupes data and attempts to compress, you don't get such good results. However, at some point we've got to accept that this is part of the cost of doing business and look to do a better job of encrypting data.

With respect to SQL Server we have options. Among them:

Built-in encryption within the database (at the "field" level)
Transparent Data Encryption at the database level (which also ensures native backups are encrypted)
Third party backup products that support encryption
Third party encryption products that offer similar capabilities to TDE

It's a matter of identifying what needs to be encrypted, the proper solution, and accepting the cost. To say it's not required, the excuse used by Haley, is just that, an excuse. As IT professionals we should press for the right solution. I realize this is ultimately a business proposition. However, if we're not bringing up the discussion, we're part of the problem.

11 Jun 22:16

Spooling in SQL execution plans

by Rob Farley

Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use.

That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful.

If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too...

Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here).

Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment.

In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data.

I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool!

Many of my favourite features in T-SQL do like to use spools though.

This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time.

So what does a Spool operator actually do?

The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top.

A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy).

Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows.

And when is it needed?

The way I see it, spools are needed for two reasons.

1 – When data is going to be needed AGAIN.

2 – When data needs to be kept away from the original source.

If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it.

In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done.

There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query.

Table v Index

When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile.

It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure.

Recursive CTE

I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here.

With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t.

On the other hand, if we don’t pull as many columns back, we might see a very different plan.

You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table).

When recursion is used, though, spooling should be expected.

Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source.

(This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012)

The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like).

Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query.

I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it.

@rob_farley

11 Jun 22:15

Hello Operator, My Switch Is Bored

by Paul White

This is a post for T-SQL Tuesday #43 hosted by my good friend Rob Farley. The topic this month is Plan Operators. I haven’t taken part in T-SQL Tuesday before, but I do like to write about execution plans, so this seemed like a good time to start.

This post is in two parts. The first part is primarily an excuse to use a pretty bad play on words in the title of this blog post (if you’re too young to know what a telephone operator or a switchboard is, I hate you). The second part of the post looks at an invisible query plan operator (so to speak).

1. My Switch Is Bored

Allow me to present the rare and interesting execution plan operator, Switch:

Switch Showplan Operator

Books Online has this to say about Switch:

BOL description

Following that description, I had a go at producing a Fast Forward Cursor plan that used the TOP operator, but had no luck. That may be due to my lack of skill with cursors, I’m not too sure.

The only application of Switch in SQL Server 2012 that I am familiar with requires a local partitioned view:

CREATE TABLE dbo.T1 (c1 int NOT NULL CHECK (c1 BETWEEN 00 AND 24));

CREATE TABLE dbo.T2 (c1 int NOT NULL CHECK (c1 BETWEEN 25 AND 49));

CREATE TABLE dbo.T3 (c1 int NOT NULL CHECK (c1 BETWEEN 50 AND 74));

CREATE TABLE dbo.T4 (c1 int NOT NULL CHECK (c1 BETWEEN 75 AND 99));

GO

CREATE VIEW V1 AS

SELECT c1 FROM dbo.T1

UNION ALL

SELECT c1 FROM dbo.T2

UNION ALL

SELECT c1 FROM dbo.T3

UNION ALL

SELECT c1 FROM dbo.T4;

Not only that, but it needs an updatable local partitioned view. We’ll need some primary keys to meet that requirement:

ALTER TABLE dbo.T1

ADD CONSTRAINT PK_T1

PRIMARY KEY (c1);

ALTER TABLE dbo.T2

ADD CONSTRAINT PK_T2

PRIMARY KEY (c1);

ALTER TABLE dbo.T3

ADD CONSTRAINT PK_T3

PRIMARY KEY (c1);

ALTER TABLE dbo.T4

ADD CONSTRAINT PK_T4

PRIMARY KEY (c1);

We also need an INSERT statement that references the view. Even more specifically, to see a Switch operator, we need to perform a single-row insert (multi-row inserts use a different plan shape):

INSERT dbo.V1 (c1)

VALUES (1);

And now…the execution plan:

Switch Execution Plan

The Constant Scan manufactures a single row with no columns. The Compute Scalar works out which partition of the view the new value should go in. The Assert checks that the computed partition number is not null (if it is, an error is returned). The Nested Loops Join executes exactly once, with the partition id as an outer reference (correlated parameter).

The Switch operator checks the value of the parameter and executes the corresponding input only. If the partition id is 0, the uppermost Clustered Index Insert is executed, adding a row to table T1. If the partition id is 1, the next lower Clustered Index Insert is executed, adding a row to table T2…and so on.

In case you were wondering, here’s a query and execution plan for a multi-row insert to the view:

INSERT dbo.V1 (c1)

VALUES (1), (2);

Yuck! An Eager Table Spool and four Filters! I prefer the Switch plan.

My guess is that almost all the old strategies that used a Switch operator have been replaced over time, using things like a regular Concatenation Union All combined with Start-Up Filters on its inputs. Other new (relative to the Switch operator) features like table partitioning have specific execution plan support that doesn’t need the Switch operator either. This feels like a bit of a shame, but perhaps it is just nostalgia on my part, it’s hard to know.

Please do let me know if you encounter a query that can still use the Switch operator in 2012 – it must be very bored if this is the only possible modern usage!

2. Invisible Plan Operators

The second part of this post uses an example based on a question Dave Ballantyne asked using the SQL Sentry Plan Explorer plan upload facility. If you haven’t tried that yet, make sure you’re on the latest version of the (free) Plan Explorer software, and then click the Post to SQLPerformance.com button. That will create a site question with the query plan attached (which can be anonymized if the plan contains sensitive information). Aaron Bertrand and I keep a close eye on questions there, so if you have ever wanted to ask a query plan question of either of us, that’s a good way to do it.

The problem

The issue I want to talk about revolves around a query issued against a calendar table. The script below creates a simplified version and adds 100 years of per-day information to it:

USE tempdb;

GO

CREATE TABLE dbo.Calendar

    dt          date     NOT NULL,

    isWeekday   bit      NOT NULL,

    theYear     smallint NOT NULL,

    CONSTRAINT PK__dbo_Calendar_dt

        PRIMARY KEY CLUSTERED (dt)

);

GO

-- Monday is the first day of the week for me

SET DATEFIRST 1;

-- Add 100 years of data

INSERT dbo.Calendar WITH (TABLOCKX)

    (dt, isWeekday, theYear)

SELECT

    CA.dt,

    isWeekday =

        CASE

            WHEN DATEPART(WEEKDAY, CA.dt) IN (6, 7)

            THEN 0

            ELSE 1

        END,

    theYear = YEAR(CA.dt)

FROM Sandpit.dbo.Numbers AS N

CROSS APPLY

    VALUES (DATEADD(DAY, N.n - 1, CONVERT(date, '01 Jan 2000', 113)))

) AS CA (dt)

WHERE

    N.n BETWEEN 1 AND 36525;

The following query counts the number of weekend days in 2013:

SELECT

    Days = COUNT_BIG(*)

FROM dbo.Calendar AS C

WHERE

    theYear = 2013

    AND isWeekday = 0;

It returns the correct result (104) using the following execution plan:

Original Plan

The query optimizer has managed to estimate the number of rows returned from the table exactly, based purely on the default statistics created separately on the two columns referenced in the query’s WHERE clause. (Well, almost exactly, the unrounded estimate is 104.289 rows.)

There is already an invisible operator in this query plan – a Filter operator used to apply the WHERE clause predicates. We can see it by re-running the query with the enormously useful (but undocumented) trace flag 9130 enabled:

TF 9130 plan

Now we can see the full picture. The whole table is scanned, returning all 36,525 rows, before the Filter narrows that down to just the 104 we want. Without the trace flag, the Filter is incorporated in the Clustered Index Scan as a residual predicate. It is a little bit more efficient than using a separate operator, but residual predicates are still something you will want to avoid where possible. The estimates are still spot on though:

Plan Explorer Actual vs Estimate

Anyway, looking to improve the performance of this query, Dave added the following filtered index to the Calendar table:

CREATE NONCLUSTERED INDEX Weekends

ON dbo.Calendar(theYear)

WHERE isWeekday = 0;

The original query now produces a much more efficient plan:

Filtered index plan

Unfortunately, the estimated number of rows produced by the seek is now wrong (365 instead of 104):

Wrong Estimate!

What’s going on? The estimate was spot on before we added the index!

Explanation

You might want to grab a coffee for this bit.

Using another trace flag or two (8606 and 8612) we can see that the cardinality estimates were exactly right initially:

8606 and 8612

The highlighted information shows the initial cardinality estimates for the base table (36,525 rows), the result of applying the two relational selects in our WHERE clause (104 rows), and after performing the COUNT_BIG(*) group by aggregate (1 row). All of these are correct, but that was before cost-based optimization got involved :)

Cost-based optimization

When cost-based optimization starts up, the logical tree above is copied into a structure (the ‘memo’) that has one group per logical operation (roughly speaking). The logical read of the base table (LogOp_Get) ends up in group 7; the two predicates (LogOp_Select) end up in group 8 (with the details of the selections in subgroups 0-6). These two groups still have the correct cardinalities as trace flag 8608 output (initial memo contents) shows:

8608

During cost-based optimization, a rule called SelToIdxStrategy runs on group 8. It’s job is to match logical selections to indexable expressions (SARGs). It successfully matches the selections (theYear = 2013, is Weekday = 0) to the filtered index, and writes a new alternative into the memo structure. The new alternative is entered into group 8 as option 1 (option 0 was the original LogOp_Select):

Rule result

The new alternative is to do nothing (PhyOp_NOP = no operation), but to instead follow the new logical instructions listed below the NOP.

The LogOp_GetIdx (full read of an index) goes into group 21, and the LogOp_SelectIdx (selection on an index) is placed in group 22, operating on the result of group 21. The definition of the comparison ‘the Year = 2013’ (ScaOp_Comp downwards) was already present in the memo starting at group 2, so no new memo groups are created for that.

New Cardinality Estimates

The new memo groups require two new cardinality estimates to be derived. First, LogOp_Idx (full read of the index) gets a predicted cardinality of 10,436. This number comes from the filtered index statistics:

DBCC SHOW_STATISTICS (Calendar, Weekends)

WITH STAT_HEADER;

Filtered Index Header

The second new cardinality derivation is for the LogOp_SelectIdx applying the predicate (theYear = 2013). To get a number for this, the cardinality estimator uses statistics for the column ‘theYear’, producing an estimate of 365 rows (there are 365 days in 2013!):

DBCC SHOW_STATISTICS (Calendar, theYear)

WITH HISTOGRAM;

365 row estimate

This is where the mistake happens. Cardinality estimation should have used the filtered index statistics here, to get an estimate of 104 rows:

DBCC SHOW_STATISTICS (Calendar, Weekends)

WITH HISTOGRAM;

Filtered stats histogram

Unfortunately, the logic has lost sight of the link between the read of the filtered index (LogOp_GetIdx) in group 22, and the selection on that index (LogOp_SelectIdx) that it is deriving a cardinality estimate for, in group 21. The correct cardinality estimate (104 rows) is still present in the memo, attached to group 8, but that group now has a PhyOp_NOP implementation.

Skipping over the rest of cost-based optimization (in a belated attempt at brevity) we can see the optimizer’s final output using trace flag 8607:

8607

This output shows the (incorrect, but understandable) 365 row estimate for the index range operation, and the correct 104 estimate still attached to its PhyOp_NOP. This tree still has to go through a few post-optimizer rewrites and ‘copy out’ from the memo structure into a tree suitable for the execution engine. One step in this process removes PhyOp_NOP, discarding its 104-row cardinality estimate as it does so.

To finish this section on a more positive note, consider what happens if we add an OVER clause to the query aggregate. This isn’t intended to be a ‘fix’ of any sort, I just want to show you that the 104 estimate can survive and be used if later cardinality estimation needs it:

SELECT

    Days = COUNT_BIG(*) OVER ()

FROM dbo.Calendar AS C

WHERE

    theYear = 2013

    AND isWeekday = 0;

The estimated execution plan is:

OVER plan

Note the 365 estimate at the Index Seek, but the 104 lives again at the Segment! We can imagine the lost predicate ‘isWeekday = 0’ as sitting between the seek and the segment in an invisible Filter operator that drops the estimate from 365 to 104.

Even though the NOP group is removed after optimization (so we don’t see it in the execution plan) bear in mind that all cost-based choices were made with the 104-row memo group present, so although things look a bit odd, it shouldn’t affect the optimizer’s plan selection. I should also mention that we can work around the estimation issue by including the index’s filtering columns in the index key:

CREATE NONCLUSTERED INDEX Weekends

ON dbo.Calendar(theYear, isWeekday)

WHERE isWeekday = 0

WITH (DROP_EXISTING = ON);

There are some downsides to doing this, including that changes to the isWeekday column may now require Halloween Protection, but that is unlikely to be a big problem for a static calendar table ;) With the updated index in place, the original query produces an execution plan with the correct cardinality estimation showing at the Index Seek:

Fixed plan

That’s all for today, remember to let me know about any Switch plans you come across on a modern instance of SQL Server!

Finally, here are some other posts of mine that cover other plan operators:

Segment and Sequence Project

Common Subexpression Spools

Why Plan Operators Run Backwards

Row Goals and the Top Operator

Hash Match Flow Distinct

Top N Sort

Index Spools and Page Splits

Singleton and Range Seeks

Bitmaps

Hash Join Performance

Compute Scalar

Twitter: @SQL_Kiwi

11 Jun 22:15

T-SQL Tuesday #43 – Plan Operators

by Wayne Sheffield

This month’s TSQL-Tuesday is hosted by Rob Farley (blog | twitter), who became a Microsoft Certified Master this past December (congrats!). As this month’s host, Rob has selected the topic of this month’s world-wide T-SQL Tuesday blogging party to be about Plan Operators. Specifically, one or more Plan Operators that we’ve looked into. With this guideline in mind, I’ve decided to write about:

Performance affecting plan operators

(Table-Valued Function)

One of the operators that I look for in execution plans is the Table-Valued Function (TVF) operator. This operator represents a T-SQL Multi-Statement TVF or a CLR TVF. In either case, when the function is called, the results are stored in tempdb, and then returned from tempdb to the query. So, right off, we are incurring IO operations (definitely logical, probably physical also) in tempdb.

However, this isn’t why I go looking for this operator. When I see this operator in an execution plan, I know that I’m not seeing the entire picture. With this operator, all we know from looking at the execution plan is that “something” is going on, but we don’t know what that “something” is or what its true cost is. (I did a deeper dive into this in a previous blog post: Comparing Inline and Multi-Statement Table-Valued Functions.) What gets me the most is that SQL “sees” these operators with an absurdly low cost (normally 0%). Since they are based on Table Variables, SQL also “sees” an estimated number of rows of just 1 row. Having this Table Valued Function present in the execution plan means that the query plan chosen frequently has room for improvement. If you can convert the Multi-Statement TVF into an Inline TVF, SQL will be able to consider everything going on and will frequently produce a better performing query plan.

/ (RID Lookup / Key Lookup)

The RID and Key Lookup Operators are a bookmark lookup (you might remember the bookmark lookup operator from SQL 2000), in which a lookup is being performed into a table based upon either the Row Identifier (RID) or the Clustered Index Key. What this means is that the query has decided to utilize a non-clustered index on a table, but that index doesn’t completely cover the query (meaning that there are columns utilized in the query that are not in the index). Therefore, SQL has to take the RID (or Clustered Index key value(s)) from the non-clustered index, and perform an additional lookup operation into the table for that row to get the extra columns needed for the query. This lookup operation is performed for each row being returned. When seeing this plan operator in an execution plan, I see the potential opportunity for indexing tuning and/or query tuning to improve the performance of the query.

Seeing the Table Scan operator in a query plan also means that there is a table that we should consider adding a clustered index to.

/ (Table Scan / Clustered Index Scan)

The Table Scan and Clustered Index Scan operators tell me that SQL has decided that it needs to scan an entire table. The difference between these two operators is that the Table Scan is against a heap, while a Clustered Index Scan is against a table with a Clustered Index. Normally, when you see either of these operators, you have a query that is scanning the entire table. When seeing these operators, I’m again seeing indexing and query tuning opportunities. However, keep in mind that SQL can handle some small tables more efficiently with a Table (or Clustered Index) Scan than with other operators.

(Select)

The Select operator (referenced in BOL as a Result operator) is one that is present in all query plans for select statements. It’s also the easiest one to find – it will always be at the top left. And most people never really look at this operator – yet this operator can tell you so much about why the query plan is performing badly. That little exclamation mark in the yellow triangle tells me that there is something in this query that generated a warning. (In the case of this contrived query, it is an implicit conversion – knowing this means that I will probably see either a table scan, clustered index scan or index scan somewhere in the execution plan.) However, this operator is loaded with other information. By checking the properties, I can determine whether SQL was able to generate the best plan possible for this query, or whether it settled for one that was “good enough”… or even if it took the one with the least cost out of the combination of plans that it had looked for before it was forced to make a choice based upon time (a “timeout”). It can tell you the set options in effect for this query (well, not all – just the ones that can have an effect on how a query runs). It can even tell you the parameters that were supplied for this query, and the parameters that this query plan was initially designed for (which can help you troubleshoot parameter sniffing issues). When you are examining query plans, this is one that you don’t want to ignore.

Wrap-up

I’ve presented to you here the Plan Operators that I initially look for when I’m looking at a query for performance issues. Finding these may lead me to tweak indexes, or to perform coding changes to the query.

Thanks for this subject Rob. I’m looking forward to see what everyone else has to say about it.

11 Jun 22:15

Database Restore with Snapshot - Backup Chain Breaker

by prince rastogi

Hi Friends,

Today I will show you practical for one of the reason of Transaction Log backup chain breaker i.e. restoring database with database snapshot. Suppose I am having one database named as SQLTEST having one table named as xtSQLTest with some amount of data.

--create database
create database SQLTEST
go
USE SQLTEST
go
-- create a table
create table xtTest
(
	id int identity(1,1)
)
go
--insert some data in the table
insert into xtTest default values;
go 100

Now take full backup and insert some more data then again take first transaction log backup:

--take the full backup
backup database SQLTEST to disk='e:\Full.bak'
go
--insert some more data in the table
insert into xtTest default values;
go 100
--take first Transaction log backup
backup log SQLTEST to disk='e:\tlog1.trn'
go

Now create database snapshot here and insert some more data then again take second transaction log backup:

create database snap_SQLTEST on
(NAME=SQLTEST,
 FILENAME='E:\SQLTEST.snap')
 as snapshot of SQLTEST
 go
--insert some more data in the table
insert into xtTest default values;
go 100
--take second Transaction log backup
backup log SQLTEST to disk='e:\tlog2.trn'
go

So everything is running fine till here, now just restore the database with the snapshot which you created earlier, then again try to take third transaction log backup:

-- restore database snapshot
use master
go
restore database SQLTEST from database_snapshot='snap_SQLTEST'
go
--now taking my third Transaction log backup
backup log SQLTEST to disk='e:\tlog3.trn'
go

Here Snapshot restoration ran fine but we receive below mention error when trying to take third transaction log backup:

Msg 4214, Level 16, State 1, Line 2

BACKUP LOG cannot be performed because there is no current database backup.

Msg 3013, Level 16, State 1, Line 2

BACKUP LOG is terminating abnormally.

backupbreaker

So keep in mind that every time when you restore database snapshot, reinitialize the backup chain with full backup after that you can also take transaction log backups:

--Take New Full backup to start new backup chain
backup database SQLTEST to disk='e:\Full_AfterSnapshotRestore.bak'
go
--take first Transaction log backup
backup log SQLTEST to disk='e:\tlog1_AfterSnapshotRestore.trn'
go

If you liked the post, do like us on Facebook at http://www.facebook.com/SQLServerGeeks

Have a SQL Server question? Join the fastest growing SQL Server group on faceBook - http://www.facebook.com/groups/458103987564477/

Thanks & Regards:

Prince Kumar Rastogi

11 Jun 22:15

The Accidental DBA (Day 11 of 30): Backups: Backup Storage and Retention

by Paul Randal

This month the SQLskills team is presenting a series of blog posts aimed at helping Accidental DBAs ‘keep the SQL Server lights on’. It’s a little taster to let you know what we’ll be covering in our brand new Immersion Event for The Accidental DBA, which we’ll be presenting for the first time in September 2013 in Bellevue, WA. You can find all the other posts in this series at http://www.SQLskills.com/help/AccidentalDBA. Enjoy!

There are two very common mistakes that DBAs (accidental or otherwise) make around backups:

Storing the only copy of the backups on the same I/O subsystem as the databases
Only keeping the most recent full backup, and differential/log backups since then

Backup Storage

The worst place in the world to store your backups is on the same portion of the I/O subsystem as the databases themselves.

If that I/O subsystem has a problem that causes corruption in the databases, the odds are that the same problem will affect the backups. Similarly, if someone makes a mistake and accidentally re-formats the I/O subsystem, not only will it destroy the databases but also the backups too.

How can you recover if your backups have also been corrupted or destroyed? You can’t. Or at least you can’t restore from your backups. I’ve heard of many cases where companies have lost a lot of money, business, respect, and jobs because of backups being co-located with databases.

Backups need to at least be stored on a different I/O subsystem from the databases, on a different server. That way a localized failure will not damage the backups as well.

Even that’s not good enough though. What if the entire data center is damaged or shut down (think about catastrophic power loss without any backup power source)? If the only backups are stored onsite, recovery will not be possible. For this reason, it’s highly advisable to keep copies of your backups at an offsite location that’s easily accessible with a reasonable amount of time.

Note that I said ‘copoes of your backups at an offsite location’. You shouldn’t have your backups stored only offsite, because most of the time a disaster is localized and the entire data center is not affected. You just need to cope with the case when it is. So you should have onsite and offsite copies of your backups, to allow for fast access to cope with localized problems as well as slower access when a more comprehensive disaster occurs.

A common backup storage strategy involves first backing up locally to disk storage, to a network share or to local storage and then copied to a network share, and then copying the backups to an archive system onsite plus creating copies to ship offsite.

Whatever your strategy becomes, just don’t have only a single, local copy of your backups, stored with the databases; that’s the worst possible way to do it.

Backup Retention

One of the issues you may have is that you only have limited onsite and offsite storage space for your backups – so how long do you keep your backups for?

Imagine a timeline of backups: F1 L1 L2 D1 L3 L4 D2 L5 L6 D3 L7 L8 F2 L9 L10 D4 L11 L12 D5 L13 L14 D6 L15 L16

You may wonder why you need to keep any backup prior to the most recent full backup, F2. Think about this – what if the F2 full backup is corrupt?

If you haven’t kept any backups prior to F2, without the F2 backup to start your restore sequence, all the backups you have after F2 are completely useless.

If you can’t use the most recent full backup, the next best thing is the full backup you took before that, F1. And to use that full backup, you’ll also need the most recent differential backup that was based on that full backup, D3, plus log backups L7 and L8, to get you up to the point in time around F2, and then log backups L9 to L16 (because differential backups D4-D6 are based off F1, not F2, and so cannot be used).

As you can see, having a corrupt full backup also means having to restore a lot more backups, which will likely blow your downtime requirement, but that should be a rare case and an exception should be able to be made. You just need to make sure you have the older backups.

So, again, how long do you keep your backups for?

If I had a backup strategy that was weekly full backups, daily differential backups, and log backups every 30 minutes, I would try to keep at least a month’s worth of backups at all times onsite, with three months of backups stored offsite. Do you think that sounds excessive? Unless you’re testing a multitude of restore sequences every week using all the older backups, which is rarely possible, you need to have ‘defense in depth’ – multiple possible alternate restore strategies if multiple older backups prove to be damaged or invalid.

Summary

This was the last post in our Accidental DBA series about SQL Server backups. Over the last six posts I’ve explained:

What the downtime (RTO) and data loss (RPO) requirements are, and why they’re critical to know for your databases
The three recovery models that can be used, and how they affect the log and log backups
The three main types of backups (full, differential, and log) and what they’re used for
How to design a restore strategy that takes into account the RPO and RTO requirements, which then lets you decide what backups you need to perform
Commonly used BACKUP options and why they’re useful
The importance of validating backups to make sure that when you need them, they’re usable
The importance of storing multiple copies of backups, both onsite and offsite, so a disaster doesn’t result in complete data loss
Finally, the importance of retaining older backups so your disaster recovery strategy can cope with some backups being invalid

I hope this has given you a good grounding in the importance of having a good, well-tested backup strategy. Protecting the data from loss and damage is really one of the most important things that a DBA does.

The post The Accidental DBA (Day 11 of 30): Backups: Backup Storage and Retention appeared first on Paul S. Randal.

11 Jun 22:09

What Can You Find Out From Metadata?

by samzenpus

cervesaebraciator writes "In the wake of recent revelations from Edward Snowden, apologists for the state security apparatus are predictably hitting the airwaves. Some are even 'glad' the NSA has been doing this. A major point they emphasize is that the content of calls have remained private and it is only the metadata that they're interested in. But given how much one can tell from interpersonal connections, does the surveillance only represent a 'modest encroachments on privacy?' It is easy enough to imagine how metadata on phone calls made to and from a medical specialist could be more revealing than we'd like. But social network analysis can reveal far more. Duke sociologist Kieran Healy, in a light-hearted but telling article, shows how one father of the American Revolution could have been identified using the simplest tools of social network analysis and only a limited dataset."

Microsoft works with financial services industry leaders, law enforcement and others to disrupt massive financial cybercrime ring

by Jeffrey Meisner

The following is a post from Richard Domingues Boscovich, Assistant General Counsel, Microsoft Digital Crimes Unit.

In our most aggressive botnet operation to date, the Microsoft Digital Crimes Unit worked with leaders of the financial services industry, other technology industry partners and the Federal Bureau of Investigation to disrupt a massive cyber threat responsible for stealing people’s online banking information and personal identities. With a court ordered civil seizure warrant from the U.S. District Court for the Western District of North Carolina, Microsoft executed a simultaneous operation to disrupt more than 1,400 Citadel botnets which are responsible for over half a billion dollars in losses to people and businesses worldwide.

Meanwhile, the FBI took coordinated separate steps related to the operation. This collaborative action – codenamed Operation b54 – is Microsoft’s seventh botnet operation to date and part of a growing proactive effort by both the public and private sector to fight cybercrime, help protect people and businesses from online fraud and identity theft, and enhance cloud security for everyone. This operation marks the first time that law enforcement and the private sector have worked together in this way to execute a civil seizure warrant as part of a botnet disruption operation.

As Reuters first reported, due to Citadel’s size and complexity, we do not expect to fully take out all of the botnets in the world using the Citadel malware. However, we do expect that this action will significantly disrupt Citadel’s operation, helping quickly release victims from the threat and making it riskier and more costly for the cybercriminals to continue doing business.

Similar to past botnet efforts, Microsoft will use the intelligence gained in Operation b54 to work with Internet Service Providers (ISPs) and Computer Emergency Response Teams (CERTs) around the world to quickly and efficiently clean as many computers as possible. Microsoft will be making this information available through its Cyber Threat Intelligence Program (C-TIP), including the recently-announced cloud-based version of the program. For computer owners worried that their computers might be infected, Microsoft offers free information and malware removal tools at http://support.microsoft.com/botnets. Additionally, the FBI provides information on its website about botnets to educate the public on how to protect themselves. Many financial services industry organizations also provide resources, tips and tools to individuals and companies on how to protect themselves.

Like many of our past operations, this investigation once again revealed how criminals are adapting and evolving their attack methods in order to continue to infect people’s computers with malware. For instance, during our investigation we found that Citadel blocked victims’ access to many legitimate anti-virus/anti-malware sites, making it so people may not have been able to easily remove this threat from their computer. However, with the disruptive action, victims should now be able to access these previously blocked sites. We also found that cybercriminals are using fraudulently obtained product keys created by key generators for outdated Windows XP software to develop their malware and grow their business, demonstrating another link between software piracy and global cybersecurity threats. (Of note, Windows Vista, Windows 7 and Windows 8 have measures in place to help protect against this type of misuse of product keys.) This discovery showcases that, in addition to exercising safe online practices like running updated and legitimate software and using firewall and antivirus protection, people also need to use modern versions of Windows software to better prevent malware, fraud and identify theft.

Cooperation is the key to winning the fight against cybercrime, and I’m excited about the opportunity we had to work with law enforcement and the other partners involved in this operation and the impact of that cooperative effort. Operation b54 serves as a real world example of how public-private cooperation can work effectively within the judicial system, and how 20^th century legal precedent and common law principles dating back hundreds of years can be effectively applied toward 21^st century cybersecurity issues. Building on recent remarks from government leaders, including President Barack Obama, U.S. Senator Sheldon Whitehouse, Congressman Mike Rogers, and others calling for increased public-private cooperation to combat cyber threats, I look forward to similar cooperative efforts in the future as we continue our goal to fundamentally disrupt the cybercriminal ecosystem.

I’d like to thank the FBI, the U.S. Marshals, as well as all the financial and technology industry partners that we worked with on this operation, including the Financial Services – Information Sharing and Analysis Center (FS-ISAC), NACHA – The Electronic Payments Association, the American Bankers Association (ABA), Agari, A10 Networks and Nominum.

For more information on this operation and comment from others involved, please read our press release.

Meanwhile, this case and operation are ongoing, and we’ll continue to provide updates as they become available. To stay up-to-date on the latest developments on the fight against cybercrime, follow the Microsoft Digital Crimes Unit on Facebook and Twitter.

10 Jun 19:09

Build and Restore Snapshots in Hyper-V with Windows PowerShell

by The Scripting Guys

Summary: Microsoft PowerShell MVP, Sean Kearney, talks about using Windows PowerShell to build and restore snapshots in Hyper-V.

Microsoft Scripting Guy, Ed Wilson, is here. If you are a seasoned Hey, Scripting Guy! Blog reader, you know that the most frequent guest blogger is Sean Kearney. If you are new to the blog, I welcome you, and I encourage you to catch up with Sean’s previous blogs.

Sean is a Windows PowerShell MVP and an Honorary Scripting Guy. Sean has been selected to present sessions called Integrating with Microsoft System Center 2012 and Windows PowerShell at TechEd NA and TechEd Europe this year. In his free time, Sean has written several blog posts about Hyper-V and some other cool stuff, and for the next few weeks, Sean will be the designated guest blogger on Fridays. Take it away Sean…

With Microsoft TechEd Europe and a few other presentations on the horizon, I was thinking about how nice it would be have my environments self manage for the demos.

Typically with most demos in a virtual environment, you make a snapshot of everything in the Gold setup. Then when you enter into your presentations, you can immediately roll back to the original configuration to prepare for your next presentation.

Or perhaps you’re firing up a test environment, and you need to be able to take a snapshot and restore sets of machines to troubleshoot a package issue.

With Windows Server 2012 and the new Hyper-V module, this is as natural as breathing.

Let’s say we have a list of virtual machines in our environment: a domain controller, a SQL Server, and perhaps our web application. We’ll name them DC1, SQL1, WEB1. Although these are the NetBios names, for convenience I have given them the same names in Hyper-V.

$VMNames=”DC1”,”SQL1”,”WEB1”

To create a checkpoint on the virtual machines, we can simply use the Checkpoint-VM cmdlet and run it like this:

CHECKPOINT-VM –Name DC1 –Snapshotname ‘Base’

This will not only create a snapshot, but it will have a unique name that we can refer to. To run this against a list of machines, we can pipe the list by using Get-VM or supply a list like in our variable:

$VMNames | GET-VM | CHECKPOINT-VM –SnapshotName ‘Base’

This will create a snapshot named “Base” on the virtual machines named DC1, SQL1, and WEB1.

If we’d like to access all of these snapshots, we can run this command:

$VMNames | GET-VM | GET-VMSnapshot –Name ‘Base’

We can now take this list and restore them:

$VMNames | GET-VM | GET-VMSnapshot –Name ‘Base’ | RESTORE-VMSnapshot –confirm:$False

One line and now I can restore any demo by using Windows Powershell. With very minor changes, I can create multiple snapshots with unique names by changing the name of snapshot.

Cool, eh?

Feel the Power within you.

~Sean
The Energized Tech

Great job, Sean. Join us tomorrow for more cool Windows PowerShell stuff.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy

10 Jun 19:09

How to Enable User-Based Control/Enforcement of BitLocker on Removable Data Drives

by Doug Symalla

Hi readers, John Sanz here. A customer of mine came to me with a business requirement that couldn’t be satisfied out of the box, so I thought I would share the solution with you.

The business requirement was that they needed to be able to enforce BitLocker encryption on Removable drives by users, instead of by computers. Specifically, they needed to address the following 3 distinct user group scenarios:

1. Subset of Users should only be able to have Read-Only access to removable drives, whether the drive is encrypted or not.

2. Subset of Users should be able to have Read-Only access to removable media that is not encrypted, and Read/Write access to removable drives that are BitLocker encrypted. In addition, if they connect a drive that is not encrypted they should get prompted to encrypt the drive. If they choose to encrypt the drive, they should then have Read/Write access.

3. Subset of Users should be able to have Read/Write access to all removable drives, regardless of encryption.

Out of the box, it is already possible to do this via Group Policy, however only based on Computers (policy settings only exist under Computer Configuration) and the customer needed to do this based on Users… These are the policy settings under Computer Configuration:

To accomplish this, we decided to leverage Group Policy Preferences settings under User Configuration, which have the ability of setting registry values under both HKEY_CURRENT_USER and HKEY_LOCAL_MACHINE. Now, because we are configuring HKLM settings via User preferences, these would normally stick and apply to the next user to log in. To avoid this behavior, for every GPP setting, there has to be an exact inverse (mutually exclusive) targeted item to unset the value in question (as depicted in the Item Level Targeting screenshots below). This way, if users satisfy ‘condition x’ then that means that all other users automatically satisfy ‘condition y’ which unsets the value in question. Here are the specifics of the groups and preference settings we implemented:

GROUPS:

- BITLOCKER_PROMPT

- BITLOCKER_RESTRICT

GROUP POLICY PREFERENCES:

Note, all of the registry values being configured here are REG_DWORD values under the “[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\FVE]”

- RDVAllowBDE - set the value to 1 if user is a member of the BITLOCKER_PROMPT user group

- RDVAllowBDE - set the value to 0 if user is not a member of the BITLOCKER_PROMPT user group

This preference setting has the effect of causing members of the BITLOCKER_PROMPT user group to be prompted to encrypt a removable drive that is not already encrypted. For all users that are not members of the BITLOCKER_PROMPT user group, they will not be prompted to encrypt the dive.

- RDVConfigureBDE - set the value to 0 if the user is not a member of BITLOCKER_RESTRICT

- RDVConfigureBDE - set the value to 1 if the user is a member of BITLOCKER_RESTRICT

- RDVDenyWriteAccess - set the value to 0 if the user is not a member of BITLOCKER_RESTRICT

These preference settings have the effect of removing control of removable drive encryption from Bitlocker. This means that users that are not members of BITLOCKER_RESTRICT will be able to Read/Write to all removable drives without getting prompted to encrypt if the drive is unencrypted, whereas users that are members of the BITLOCKER_RESTRICT will get prompted to encrypt removable drives that are not encrypted and will be have Read-Only access to unencrypted drives and Read/Write to encrypted drives.

In case any of you are wondering whether or not this same approach can be utilized for any number of other Computer Configuration policy settings, the answer is yes, but not all of them. Whether or not this approach will work, depends on how and when the affected component/application reads in the settings from the registry. In the case of enforcing Bitlocker Drive Encryption requirements on Removable Drives, those settings are read in each and every time a drive is connected to the system, which is why this approach works.

I hope you found this helpful and informative, and if you have any questions, just ask…

John Sanz

10 Jun 19:07

The Myth that DROP and TRUNCATE TABLE are Non-Logged

by Paul Randal

Paul Randal (@PaulRandal) of SQLskills.com discusses the myth that DROP and TRUNCATE operations are not logged, and shows that this is simply not true.

10 Jun 19:06

Edgenet Gain Real-Time Access to Retail Product Data with In-Memory Technology

by SQL Server Team

Edgenet provides optimized product data for suppliers, retailers, and search engine. Used online and in stores, Edgenet solutions ensure that businesses and consumers can make decisions based on timely, accurate product information. And that speed of information is critical - when a customer is searching online, they have to have a great experience, or they are just going to go to another website to buy the product they’re looking for. The experience needs to be very concise and give them the information they want in a quick manner.

Edgenet was already familiar with the advantages of Microsoft client in-memory technologies such as PowerPivot and were keen to see how they could extend the advantages to their OLTP systems. Edgenet realized it could take advantage of In-Memory OLTP to improve throughput and eliminate read-write locks. The company decided to become an early adopter of the technology, which would be built into SQL Server 2014. As Michael Steineke, VP of IT says ‘We knew In-Memory OLTP would be a game changer for us’.

The traditional system was updated once a day, and the process took a few hours. Using In-Memory OLTP the new system can be done 7 times faster in about 20 minutes, and the system isn’t down while it’s being updated. It’s all done real-time!

As well as improved speed, Edgenet were also pleased that In-Memory OLTP uses the same tool set that the developers and DBAs are used to. To them it looks, basically, like standard SQL Server, even though under the covers it’s completely different.

The company also expects to increase customer satisfaction with real-time access to information. “Our reputation is based on providing highly accurate and clean data for products,” says Steineke. “With In-Memory OLTP in SQL Server 2014, we can ensure that we’re giving the correct data to the end-customer or sales associate. It’s all about being presented with the right data, at the right time, and in a form that’s easy to use.”

10 Jun 19:04

SQL Server 2014: A Closer Look

by SQL Server Team

Microsoft SQL Server 2014 was announced by Quentin Clark during the Microsoft TechEd 2013 keynote. Designed and developed with our cloud-first principles in mind, SQL Server 2014 builds on the momentum of SQL Server 2012, released just 14 months ago. We are excited to share a closer look at some of the exciting capabilities included in SQL Server 2014 that will help you unlock real-time insights with mission critical and cloud performance.

SQL Server 2014 helps organizations by delivering:

Mission Critical Performance across all database workloads with In-Memory for online transaction processing (OLTP), data warehousing and business intelligence built-in as well as greater scale and availability
Platform for Hybrid Cloud enabling organizations to more easily build, deploy and manage database solutions that span on-premises and cloud
Faster Insights from Any Data with a complete BI solution using familiar tools like Excel

Mission Critical Performance with SQL Server 2014

SQL Server 2014 delivers new in-memory capabilities built into the core database for OLTP and data warehousing, which complement existing in-memory data warehousing and business intelligence capabilities for a comprehensive in-memory database solution. In addition to in-memory, there are new capabilities to improve the performance and scalability for your mission critical applications.

In Memory Built-In

New In-Memory OLTP – built in to core SQL Server database and uniquely flexible to work with traditional SQL Server tables allowing you to improve performance of your database applications without having to refresh your existing hardware. We are seeing customers such as EdgeNet and bwin achieve significant performance gains to scale and accelerate their business.
Enhanced In-Memory ColumnStore for Data Warehousing – now updatable with even faster query speeds and with greater data compression for more real-time analytics support.
New buffer pool extension support to non-volatile memory such as solid state drives (SSDs) – Increase performance by extending SQL Server in-memory buffer pool to SSDs for faster paging.
New Enhanced Query Processing – speeds all SQL Server queries regardless of workload.

Enhanced Availability, Security and Scalability

Enhanced AlwaysOn – Built upon the significant capabilities introduced with SQL Server 2012, delivers mission critical availability with up to 8 readable secondaries and no downtime during online indexing operations.
Greater scalability of compute, networking and storage with Windows Server 2012 R2 –

- Increased scale – Continue to benefit from scale for up to 640 logical processors and 4TB of memory in a physical environment and up to 64 virtual processors and 1TB of memory per VM.
- Network Virtualization – Abstracts networking layer so that it you can easily migrate SQL Server from one datacenter to another.
- Storage Virtualization with Storage Spaces – Create pools of storage and storage tiers allowing your hot data to access the premium storage and cold data to access standard storage improving resilience, performance and predictability.

Enhanced Resource Governance – With Resource Governor, SQL Server today helps you with scalability and predictable performance, and in SQL Server 2014, new capabilities allow you to manage IO, in addition to compute and memory to provide more predictable performance.
Enhanced Separation of Duties – Achieve greater compliance with new capabilities for creating role and sub-roles. For example, a database administrator can now manage the data without seeing sensitive data or personally identifiable information.

Platform for Hybrid Cloud

SQL Server 2014 creates a strong platform for hybrid cloud where cloud scale can be leveraged to extend the scalability and availability of on-premises database applications as well as reduce costs.

Simplified Cloud Backup and Disaster Recovery

Backup to Azure Storage – reduce costs and achieve greater data protection by backing up your on-premises database to Azure Storage at an instance level. Optimize backup policy with intelligence built in to SQL Server that monitors and tracks backup usage patterns to provide optimal cloud backup. Backups can be automatic or manual, and in case of an on-premises failure, a backup can be restored to a Windows Azure Virtual Machine.
AlwaysOn integration with Windows Azure Infrastructure Services – Benefit from Microsoft’s global data centers by deploying a Windows Azure Virtual Machine as an AlwaysOn secondary for cost-effective global data protection. Increase performance and scale reporting for your global business units by running reporting off the readable secondaries in Windows Azure. Run backups on the secondaries in Windows Azure to increase data protection and performance.
SSMS Wizard for deploying AlwaysOn secondaries in Window Azure – Easily deploy an AlwaysOn secondaries to Windows Azure Virtual Machine with a point and click experience within SQL Server Management Studio (SSMS).

Easy Migration of On-Premises SQL Servers to Windows Azure Virtual Machines

SSMS Migration Wizard for Windows Azure Infrastructure Services – Easily migrate an on-premises SQL Server database to a Windows Azure Virtual machine with a point and click experience in SSMS. The newly deployed database application can be managed through SSMS or System Center 2012 R2.

Faster Insights on Any Data

SQL Server 2014 is at the heart of our modern data platform which delivers a comprehensive BI solution that simplifies access to all data types big and small with additional solutions like HDInsight, Microsoft’s 100% Apache compatible Hadoop distribution and project code name “Data Explorer”, which simplifies access to internal or external data. New data platform capabilities like Polybase included in Microsoft Parallel Data Warehouse allows you to integrate queries across relational and non-relational data using your existing SQL Server skills.

With SQL Server 2014, you can accelerate insights with our new in-memory capabilities with faster performance across workloads. You can continue to refine and manage data using Data Quality Services and Analysis Services in SQL Server and finally analyze the data and unlock insights with powerful BI tools built into Excel and SharePoint.

Learn more and sign up for the Preview

SQL Server 2014 brings to market many new exciting capabilities that will deliver tremendous value to customers. SQL Server 2014 can help you unlock real-time insights with mission critical and cloud performance along with one of the most comprehensive BI solutions in the marketplace today.

If you are at TechEd North America and want to learn more about SQL Server 2014 you will not want to miss the following key sessions:

Thursday, June 6^th: DBI-B317: Microsoft SQL Server Future and Features
Wednesday, June 5^th: DBI-B306: Microsoft SQL Server High Availability and Disaster Recovery on Windows Azure

Learn more about SQL Server 2014 and download new datasheet and whitepapers here.

Also if you haven’t already sign up for the SQL Server 2014 CTP 1 bits coming in a few weeks!

10 Jun 19:03

Geek City: Document more trace flags?

by Kalen Delaney

It’s been over 5 years since I last blogged about trace flags , so it seems a post on the topic is in order. If a trace flag is undocumented, it means it is unsupported and not guaranteed. It can go away at any time, even from one service pack to the...(read more)

10 Jun 18:55

How It Works: The I/O Path: SQL Server Running in Windows Azure Virtual Machine (IaaS)

by psssql

Note: This blog is based on behavior as of June 2013. At Microsoft we continue to evolve and enhance our products so the behavior may change over time.

The I/O path for SQL Server, running on Windows Azure Virtual Machines, uses Windows Azure storage (often referred to as XStore.) The following is a link to a whitepaper, written by the SQL Server development team, explaining various I/O aspects and tuning needs for this environment:

http://go.microsoft.com/fwlink/?LinkId=306266

I have been testing and driving workloads using SQL Server in our IaaS environment(s). Exercising test patterns from SQLIOSim, T-SQL, Tempdb stress, BCP, and the RML utility test suites. This work has afforded me the opportunity to work closely with both the SQL Server and Windows Azure development teams to better understand how to tune SQL Server for a Windows IaaS deployment.

The previously mentioned white paper points out many of the interactions but I would like to dive into some specific areas to help you understand your Windows Azure IaaS, SQL Server deployment capabilities and options.

Caching

Caching happens at multiple levels (SQL Server, VHD/VHDx, XStore, …). SQL Server requires stable media (data retained across power outages) in order to uphold the ACID properties of the database. It should not surprise you that I spent a fair amount of time understanding the various caches and how SQL Server I/O patterns interact with them from both data safety/stability and performance angles.

SQL Server Data and Log Files: FILE_FLAG_WRITETHROUGH/Forced Unit Access (FUA) Writes

SQL Server does all database and log file writes (including TEMPDB) forcing write though. This means the system is not allowed to return the success of a WriteFile* operation until the system guarantees the data has been stored in stable media. This is regardless of the cache settings at any level of the system. The data must be stored in a cache that is battery backed or written to physical media that will survive a power outage.

SQL Server Backups

When performing a backup SQL Server does NOT set the FILE_FLAG_WRITETHOUGH, instead the backup allows caching of the data to take place. When backup has written all of the data it issues the FlushFileBuffers command on the backup file(s). This acts like FUA in that the FlushFileBuffers must guarantee all data stored in cache has been written to stable media before returning success from the API.

Other Files

There are other files such as the SQL Server error log, BCP output, … that a SQL Server deployment will use. These usually do NOT use FILE_FLAG_WRITETHROUGH nor FlushFileBuffers. This behavior is not unique to Windows IaaS it has been the design and implementation for many releases of the SQL Server product. I am only mentioning this because it can make a difference to you as to where you store the data or leverage various caching mechanisms.

Drive: VHD/VHDx Cache (Drive Caching)

The first level of caching appears at the drive level. Accessing the drive’s properties exposes the drive level caching policy. When enabled the drive is allowed to cache data within the physical drive cache (for physical disks) and virtual implementations.

Drive: XStore Caching

The virtual disks attached to your Windows IaaS VM are stored in Windows Azure storage (the XStore.) The XStore provides host level caching and performance optimizations for your VM. XStore caching is controlled outside of the individual drive cache settings, previously shown. (Drive level cache settings do not impact the XStore cache settings.)

There are three cache settings (None, ReadOnly, ReadWrite) for the XStore. By default a data disk is set to NONE and an OS Disk is set to Read Write. The data disk can be configured to any of the 3 settings while an OS disk is limited to the Read Only or Read Write options.

To alter the XStore cache settings you must use a command such as the following: (The cmdlet’s are part of the Windows Azure Management cmdlets package.)

get-azurevm 'TestSub' 'TestMach' | set-azureosdisk -HostCaching "ReadOnly" | update-azurevm

Note: The configuration change does not take place until the VM is restarted.

You will find that most documentation points to the use of one or more data drives for SQL Server database and log files and I agree with this assessment but I would like to explain a bit more as to why I agree.

OS Drive Default: Read Write

To understand the recommendation it helps to understand a bit of how the XStore cache is implemented. The XStore cache can use a combination of the host’s RAM as well as disk space, on the local host to support the cache. When a read or write takes place the local cache is consulted. As you can imagine, the local cache can be faster than making a request to the Windows Azure storage cluster. To see this in action it helps if I show you some high level scenarios.

Note: These scenarios are designed to provide you a 10,000 foot view of the architecture.

Action	FUA	XStore Read Cache	XStore Write Cache	Outcome
WriteFile*	Yes	Yes	Yes	Is covering block in XStore write cache (No) Obtain cache block (could require write of an older block to XStore before reuse is allowed) Issue XStore read for block ~512K and store in local cache (write) Do original write to local cache Do original write to XStore Data is written all the way to the XStore (stable media.)
	No	Yes	Yes	Is covering block in XStore write cache (No) Obtain cache block (could require write of an older block to XStore before reuse is allowed) Issue XStore read for block ~512K and store in local cache (write) Do original write to local cache Data is NOT in stable media, will be written by XStore caching, LRU algorithms. This is not used by the SQL Server database or log files because FUA=YES but BCP out is a usage example. This may be helpful because the BCP can leverage the cache and allow the cache to optimally send data to the XStore. Note: For temporary files, such as BCP not TEMPDB, you may consider the (D:) scratch drive, provided. It is local and no XStore caching is involved.
	Yes	Yes	No	The write is propagated directly to the XStore.
	Yes	No	No	The write is propagated directly to the XStore.
ReadFile*	N/A	Yes	Yes	Is covering block in XStore read cache (No) Obtain cache block (could require reclaim of an older block before reuse is allowed) Issue XStore read for block ~512K and store in local cache (write) Provide portion of block requested to reader If the block is present in the read cache the portion of the block requested to the reader is directly serviced.

The SQL Server use cases become more clear once you understand the XStore caching behavior and combine it with SQL Server caching behavior.

SQL Server Log File

The log file is typically a serially moving, write only entity. SQL Server maintains a cache of log block to avoid physical I/O for actions such as a transaction rollback or provide data to Always On, Database Mirroring, Replication, … If you place the log on a drive that allows XStore write caching you are seldom taking advantage of the write cache. First the log is written with FUA enabled so a write has to go though the cache, all the way to the XStore. Secondly, since the log has a much higher ratio of writes to reads you are likely not using the XStore cache effectively but forcing local XStore cache writes for every backend, FUA XStore write.

SQL Server Data File

The data file is opened with FUA enabled and the SQL Server uses a large buffer pool to maintain database pages in cache. The larger your VM size the more buffer pool cache SQL Server can take advantage of, avoiding physical I/O activities.

The same write behavior is true for database pages as described in the log section above. When you then apply the XStore read cache capabilities, they may not be performance improving over what SQL Server is already caching in buffer pool. As shown in the scenario table a read, via XStore read cache enablement, can result in a fetch of a larger block into the local XStore read cache. This could be helpful for subsequent SQL Server read requests but you also incur a possible write to the local XStore cache to maintain the data in the XStore cache. Your application read pattern may also be sporadic and defeat the intent of the XStore read cache.

BCP Out

You may get an advantage of writing to a XStore, write enabled drive by allowing the XStore to cache and optimially flush information to the backend store.

BCP In

This is one of the tests that the XStore read cache improved performance for. The read ahead action of the larger blocks used by the XStore allowed the streaming of the read bytes to be faster than from a XStore drive with read caching disabled.

TEMPDB Scratch Drive – No Thanks

It can be a bit confusing that you have a (D: scratch) drive so why not use it for TEMPDB. The reason is that scratch drive is a shared resource between all VMs on the host. You are given a sandbox drive in the VM so others can’t see your data and you can’t see theirs but the physical media is shared. While the system attempts to avoid it, it does mean a noisy neighbor could impact consistent I/O throughput on the scratch drive and change your performance predictability.

Replicas of Data

The Windows Storage for the VHDs atomically replicates your data to 3 separate, physical media destinations. This is done during the write request in a way that the write is assured quorum to the devices before the write is considered complete, proving your VHDs with a high degree of data storage safety.

Remote Replicas (Geo-Replication)

The default for the VHD storage is to provide 3 local replicas of the data. A remote replica can also be established. The remote replica is currently NOT safe for SQL Server use. The remote replica is maintained asynchronously and the system, currently, does not provide the ability to group VHDs into a consistency group. Without consistency groups it is unsafe to assume SQL Server, with files on multiple VHDs maintain the write ordering across all the VHDs and as such the database won’t be recoverable.

At the current time you should not leverage the remote, Windows Azure storage replication capabilities for SQL Server as it is not supported. You should leverage SQL Server technologies that provide the capability (Always On, Database Mirroring, Log Shipping, Backup to Blob Storage, …)

Recap / Recommendation

What am I trying to say? – Test! As pointed out in the SQL Server whitepaper, the XStore is perfectly safe for SQL Server ACID requirements. However, the types of I/O pattern(s) your application(s) drive dictate how you can leverage the XStore caching capabilities.

Most implementations start with the SQL Server database and log files on a data drive, XStore caching disabled and only after testing enable various levels of XStore, caching behavior.

Bob Dorr - Principal SQL Server Escalation Engineer

10 Jun 18:54

Cloud Computing Architecture Patterns: Don’t Focus on the Client

by BuckWoody

Normally I try to put topics in the positive in other words "Do this" not "Don't do that". Sometimes its clearer to focus on what *not* to do. Popular development processes often start with screen mockups, or user input descriptions. In a scale-out pattern like Cloud Computing on Windows Azure, that's the wrong place to start.

Start with the Data

Instead, I recommend that you start with the data that a process requires. That data might be temporary or persisted, but starting with the data and its requirements helps to define not only the storage engine you need but also drives everything from security to the integrity of the application. For instance, assume the requirements show that the user must enter their phone number, and that this datum is used in a contact management system further down the application chain. For that datum, you can determine what data type you need (U.S. only or International?) the security requirements, whether it needs ACID compliance, how it will be searched, indexed and so on. From one small data point you can extrapolate out your options for storing and processing the data. Here's the interesting part, which begins to break the patterns that we've used for decades: all of the data doesn't have the same requirements. The phone number might be best suited for a list, or an element, or a string, with either BASE or ACID requirements, based on how it is used. That means we don't have to dump everything into XML, an RDBMS, a NoSQL engine, or a flat file exclusively. In fact, one record might use all of those depending on the use-case requirements.

Next Is Data Management

With the data defined, we can move on to how to store the data. Again, the requirements now dictate whether we need a full relational calculus or set-based operations, or we can choose another method based on the requirements for the data. And breaking another pattern its OK to store in more than once, in more than one location. We do this all the time for reporting systems and Business Intelligence systems, so this is a pattern we need to think about even for OLTP data.

Move to Data Transport

How does the data get around? We can use a connection-based method, sending the data along a transport to the storage engine, but in some cases we may want to use a cache, a queue, the Service Bus, or Complex Event Processing.

Finally, Data Processing

Most RDBMS engines, NoSQL, and certainly Big Data engines not only store data, but can process and manipulate it as well. Its doubtful that you'll calculate that phone number right? Well, if you're the phone company, you most certainly will. And so we see that even once we've chosen the data type, storage and engine, the same element can have different computing requirements based on how it is used.

Sure, We Need A Front-End At Some Point

Not all data is entered by human hands in fact most data isn't. We don't really need a Graphical User Interface (GUI) we need some way for a GUI to get data into and out of the systems listed earlier.

But when we do need to allow users to enter or examine data, that should be left to the GUI that best fits the device the user has. Ever tried to use an application designed for a web browser on a phone? Or one designed for a tablet on a phone? Its usually quite painful. The siren song of "We'll just write one interface for all devices" is strong, and has beguiled many an unsuspecting architect. But they just don't work out.

Instead, focus on the data, its transport and processing. Create API calls or a message system that allows for resilient transport to the device or interface, and let it do what it does best.

References

Microsoft Architecture Journal: http://msdn.microsoft.com/en-us/architecture/bb410935.aspx
Patterns and Practices: http://msdn.microsoft.com/en-us/library/ff921345.aspx
Windows Azure iOS, Android, Windows 8 Mobile Devices SDK: http://www.windowsazure.com/en-us/develop/mobile/tutorials/get-started-ios/
Windows Azure Facebook SDK: http://ntotten.com/2013/03/14/using-windows-azure-mobile-services-with-the-facebook-sdk-for-windows-phone/

10 Jun 18:53

Help Me Help You Fix That

by BuckWoody

If you've been redirected here because you posted on a forum, or asked a question in an e-mail, the person wanted you to know how to get help quickly from a group of folks who are willing to do so - but whose time is valuable. You need to put a little effort into the question first to get others to assist. This is how to do that. It will only take you a moment to read...

1. State the problem succinctly in the title

When an e-mail thread starts, or a forum post is the "head" of the conversation, you'll attract more helpers by using a descriptive headline than a vague one.

This: "Driver for Epson Line Printer Not Installing on Operating System XYZ"

Not this: "Can't print - PLEASE HELP"

2. Explain the Error Completely

Make sure you include all pertinent information in the request. More information is better, there's almost no way to add too much data to the discussion. What you were doing, what happened, what you saw, the error message, visuals, screen shots, whatever you can include.

This: "I'm getting error '5203 - Driver not compatible with Operating System since about 25 years ago' in a message box on the screen when I tried to run the SETUP.COM file from my older computer. It was a 1995 Compaq Proliant and worked correctly there.."

Not this: "I get an error message in a box. It won't install."

3. Explain what you have done to research the problem

If the first thing you do is ask a question without doing any research, you're lazy, and no one wants to help you. Using one of the many fine search engines you can most always find the answer to your problem. Sometimes you can't. Do yourself a favor - open a notepad app, and paste the URL's as you look them up. If you get your answer, don't save the note. If you don't get an answer, send the list along with the problem. It will show that you've tried, and also keep people from sending you links that you've already checked.

This: "I read the fine manual, and it doesn't mention Operating System XYZ for some reason. Also, I checked the following links, but the instructions there didn't fix the problem: "

Not this: <NULL>

4. Say "Please" and "Thank You"

Remember, you're asking for help. No one owes you their valuable time. Ask politely, don't pester, endure the people who are rude to you, and when your question is answered, respond back to the thread or e-mail with a thank you to close it out. It helps others that have your same problem know that this is the correct answer.

This: "I could really use some help here - if you have any pointers or things to try, I'd appreciate it."

Not this: "I really need this done right now - why are there no responses?"

This: "Thanks for those responses - that last one did the trick. Turns out I needed a new printer anyway, didn't realize they were so inexpensive now."

Not this: <NULL>

There are a lot of motivated people that will help you. Help them do that.

10 Jun 18:47

Are We Losing a Standard (Edition) Data Recovery Technology?

by AllenMWhite

One of the coolest technologies Microsoft released with SQL Server 2005 was Database Mirroring, which provided the ability to have a failover copy of a database on another SQL Server instance, and have the ability to automatically failover to that copy...(read more)

08 Jun 20:33

How to delete Open or Insecure Wi-Fi HotSpots from Windows 8: Wifi.exe Command Line Utility with Source

by Scott Hanselman

For the most part I'm happy with Windows 8 but one feature was removed that makes no sense to me - the wireless networks dialog.

Sure, you can "Forget this network" by right clicking on a Wi-Fi Connection, but only when that network is in range. The old Wireless Networks dialog where you could add and remove networks is gone. Who knows how many saved Wi-Fi hotspot profile I have littering my system?

So, The Problem: I want to remove saved Wi-Fi Profiles whenever I feel like it. I wrote a command line util that will work in Windows 7 and Windows 8.

TL;DR Version

There's a build zipped up of Wifi.exe available here and the source is on GitHub.

UPDATE: I've put Wifi-Manager up on Chocolately so you can now "cinst wifi-manager." Thanks to Brendan Forster for the heavy lifting! Learn more about the Chocolatey package manager here!

Caveats and "Ya I know."

First, let me be clear that I have written a command line utility to replace another command line utility. I get it. I knew it when I did it. Others smarter than I have done similar things and written utilities that match their way of thinking rather than learning an unintuitive syntax. Don't hate the playa, hate the Regular Expression.

Aside: This is also a problem with my iPhone. I likely have 50+ saved Wi-Fi spots on my phone and no way to delete them without jail-breaking.

You can access Wi-Fi profiles with the netsh.exe that comes with Windows, so you could list out profiles like this:

c:\>netsh wlan show profiles


Profiles on interface Wi-Fi:


User profiles

-------------

    All User Profile     : Wayport_Access

    All User Profile     : HANSELMAN

    All User Profile     : HANSELMAN-N

    All User Profile     : HanselSpot

    All User Profile     : EliteWifi

    All User Profile     : Qdoba Free Wifi

Then, for each one, call

c:\>netsh wlan show profile "Qdoba Free Wifi"


Profile Qdoba Free Wifi on interface Wi-Fi:

=======================================================================


Profile information

-------------------

    Version                : 1

    Type                   : Wireless LAN

    Name                   : Qdoba Free Wifi

    Control options        :

        Connection mode    : Connect manually


Connectivity settings

---------------------

    Number of SSIDs        : 1

    SSID name              : "Qdoba Free Wifi"

    Network type           : Infrastructure

For each of these profiles, check if they are secure or open, and if you are connecting manually or automatically. Then, if you wanted, you could netsh wlan delete profile name="Qdoba Free Wifi" and remove a profile, even when it's not near you.

In my recent podcast with security expert Troy Hunt, he pointed out that it's easy to create a fake honeypot Wi-Fi spot that has the same name as a common open network, like Starbucks, for example.

Given: If my PC or phone is set up to automatically connect to any open hotspot named "Starbucks" then it will just connect to one...even an evil hotspot.
Therefore: it would be nice to automatically delete profiles for Wi-Fi spots that are both open (no security) and set to automatically connect.

I was tired, so I thought I'd bang out a little utility to do this. I could have used PowerShell or something but I felt like using C#. It's exercise.

UPDATE: Lee Holmes went and wrote it in PowerShell! Amazing.

Wifi.exe and it's Usage

Tired of reading? There's a build zipped up of Wifi.exe available here and the source is on GitHub. You may need to Right Click | Properties | Unblock the zip.

There's no warranty. The code sucks and I'm a horrible person and you're running a util you found on my blog. However, it works awesome on my machine. Issues appreciated, tidy PRs appreciated more, running Resharper and doing a PR, less so. I'll update the build if good bugs require it.

If you run Wifi.exe (I put it in my path) you'll see something like this:

c:\>wifi

AP-guest                 manual    WPA2PSK

HANSELMAN-N                auto    WPA2PSK

HANSELMAN                  auto    WPA2PSK

HanselSpot                 auto    WPA2PSK

Qdoba Free Wifi          manual       open

Wayport_Access             auto       open Warning: AUTO connect to OPEN WiFi


Delete WiFi profiles that are OPEN *and* AUTO connect? [y/n]

n

Notice the columns, and the prompt. There's a warning when a hotspot is both open and set to auto-connect. If you answer Y to the prompt, the utility will delete that profile. You can also type 'wifi /deleteautoopen' to bypass the prompt and auto-delete just profiles that are auto and open.

A pull request a few minutes after I pushed this code also added the ability to

wifi delete "HOTSPOTNAME"

which is nice also. Thanks!

The Code

One of the great things about writing command line apps like this is that there's literally a dozen ways to do everything. They are trivial and silly but also useful and used daily. In this case I've got command line argument processing to think about, parsing output from a spawned process, doing the parsing in a clean way, making sure it works on a non-English machine (which I thought about but didn't test), as well as cleaning up of generated temp files.

It's hardly impressive code, but some of it was fun or interesting. Here's a few bits I liked.

Making Columns with Console.WriteLine and String.Format

Did you know that you can right- and left-align columns within a fixed with using String.Format? Few people know about this and I've seen whole libraries written custom to do the work that's built right in.

Console.WriteLine(String.Format("{0,-20} {1,10} {2,10} {3,30} ", a.Name, a.ConnectionMode, a.Authentication, warning));

Note the {0,-20} (left aligned) and the {1,10} (right aligned). Those are just like {0} and {1} in a String.Format but they include alignment and width.

Gratuitous use of Linq

It wouldn't be a silly utility without in crazy LINQ, eh? Who needs Regular Expressions when you can when you can do a SQL query over your string? ;) Actually, I don't know if this is a good thing or not. It was fun, though, and it works. Your thoughts?

This takes the output from wlan show profiles (seen above) and parses it into a list of just the AP Names. I think it should work in any language, assuming the : colons are there.

string result = ExecuteNetSh("wlan show profiles");

var listOfProfiles = from line in result.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries)

                     where line.Contains(":")

                     let l = line

                     where l.Last() != ':'

                     select l.Split(':')[1].Trim();


foreach (string profile in listOfProfiles)

    ExecuteNetSh(String.Format("wlan export profile \"{0}\" folder=\"{1}\"", profile, Environment.CurrentDirectory));

Cleaning up the temp XML files

I export a bunch of very specific XML files with a VERY non-specific extension. I can't control their file name and I don't want guess what their name is because I would need to recreate their AP Name encoding scheme. Instead, I look for any XML files in the current folder (given the rare chance that YOU, the utility runner, have XML files in the same folder already) and only delete the ones with the namespace that I know to be present in Wi-Fi profiles. I patted myself on the back for this one, but just lightly.

static XNamespace ns = "http://www.microsoft.com/networking/WLAN/profile/v1";


//Delete the exported profiles we made, making sure they are what we think they are! 

foreach (string file in Directory.EnumerateFiles(Environment.CurrentDirectory, "*.xml"))

    if (XElement.Load(file).Name.Namespace == ns)

        File.Delete(file);

Capturing Command Line Output

Finally, here's how you get the output of a command line process you started:

Process p = new Process();

p.StartInfo.FileName = "netsh.exe";

p.StartInfo.Arguments = arguments ?? String.Empty;

p.StartInfo.UseShellExecute = false;

p.StartInfo.RedirectStandardOutput = true;

p.Start();


string output = p.StandardOutput.ReadToEnd();

return output;

Pretty basic, but useful to bookmark.

Alternatives

After I wrote this I noticed there are some WinForms utilities to do this. That's great. I wouldn't mind making may own, except I'd want it to look exactly like the Windows 7 dialog. It'd be fun just to see if I could get it pixel perfect.

Feel free to go check out the code, play with it and make fun of me. https://github.com/shanselman/Windows-Wifi-Manager

Get Involved! Check out my latest production with TekPub. A meticulously edited TWO HOURS of video content where we cover everything we think a developer should know to "Get Involved" in the developer community.

Craig.talbert, Vitormbarbosa likes this

06 Jun 23:15

Scaling the Database: Why You Shouldn’t Cluster on GUIDs

by Michelle Ufford

In my last post, “Scaling the Database: Data Types,” I discussed the performance impact of data types and the repercussions of choosing the wrong data type. I also said, ... You’ll notice that I actually added a new column, myNewID, to the table. This adds 4 bytes of space but serves a very valuable tuning ...

06 Jun 23:09

Answering the “Twitter How-to” Question

by mjb

Twitter by design is simple: 140 characters or less.

By practice, on the other hand, Twitter is a tricky amount of syntax and social expectations vastly different than the other social networking giants.

I haven’t found a one-stop shop of How-to just yet, but there are a number of good sources all over the place. It will live on in a new page on this blog here.

The Syntax: Introduction

The Syntax: Beyond the Basics

Some Documentation:

http://mashable.com/guidebook/twitter/
http://personalweb.about.com/od/howtotwitter/How-To-Twitter.htm

UNSPOKEN RULES:

Be a whole person - you are never just representing your company or just yourself. Share who you are, professionally, but with your on personal touch.
Don’t be an egg – update your picture as soon as you make your account. Most people prefer pictures of you to put a face with the name.
Follow people if you like their content / want to get to know them in real life.
Retweet content (using the RT button) to show your interest. retweet with commentary if you want to add to the conversation.
Humor is your best asset on Twitter. Show your personality through wit and laughter (lol’s and “hahahaha”).
NOTE: some people let out their offensive humor, some people avoid it. Whatever you do, choose consciously whether you want to cross that line.
Be kind, thankful, congratulatory and respectful for the best results on interactions.

PRO TIPS:

If you start a tweet with an ‘@,’ it will only show up in timelines of people that follow both you and that person. To get the broadest reach, put a period (‘.’) in front of your tweets to people.
Avoid tweeting in bursts (over 2-3 broadcasting tweets in a row). To prevent bursts, use tools like Buffer or Hootsuite to schedule out your tweets.
Avoid hashtag use that isn’t either (a) Funny, like #WillTweetForBacon (b) Useful for having a conversation, like #EMCWorld for the conference or #InfoSec to talk with security folks.

30 May 23:36

Reflections on SQLBits XI

by jamiet

Its been a couple of weeks since SQLBits XI happened in Nottingham and I thought I’d jot down a few thoughts for posterity.

First the venue. I think its fair to say that the overall consensus was that the East Midlands Conference Centre was the best SQLBits venue these has been so far – I’ve been to more than half of them and based on my experience I would agree with that sentiment. The hotel especially was top quality – I was pleased with my room and the breakfasts were way better than the Travelodge standards I’ve become accustomed to at such events. Perhaps the great weather over the weekend helped lift the spirits but I’d say the bar has been set high, I hope future SQLBits conferences are at similar standard venues.

You may have noticed that many of my blog posts over the past year have been related to SSDT and my SQLBits sessions this time around followed that trend. I delivered a session on the Friday, jointly with Craig Ottley-Thistlethwaite, entitled “Real World SDDT” and also delivered my first ever day-long pre-conference seminar on the Thursday entitled “SSDT Database Projects from the ground up”. Delivering a pre-con was slightly petrifying and I spent a large part of the three months previous preparing for it; I hope it was worth it. I had 30 attendees which I was delighted with (especially given this was my first pre-con) and we established that the furthest anyone travelled was from Romania, though I’ll assume it wasn’t just so he could come to my pre-con Smile . I had some pleasant feedback via Twitter afterwards from some of the attendees and I’m crossing my fingers that the official feedback is in a similar vein.

The session I did with Craig went fairly well I thought. I did the first 30minutes where I covered the basics of SSDT deployment from a high level before Craig ratcheted the complexity up a few notches by demoing some interaction between SSDT, MSBuild & Git – really great stuff for those who like to get into the nuts and bolts of this stuff. I was delighted that Craig was willing to do the session with me (in fact it was his idea) as this was his debut public speaking gig and I’m hoping its given him the desire to do more in the future.

On the Saturday my wife’s uncle, John Milne, came to the conference. John has been working in customer service for years but of late has decided that he wants a change of career and to that end has been studying an Open University course in IT. He told me he had particularly enjoyed the database-focused modules of his course and hence I suggested he come to the free Saturday of SQLBits to try and get a flavour of what the industry is all about and perhaps learn about some real-world experiences to add to his academic travails. By the end of the day John told me he’s had a fantastic time, learned a lot, and was hooked. Mission accomplished I’d say. John lives in Leeds so I introduced him to Chris Testa-O’Neill who helps to run the Leeds user group and John should be going along to some user group events in the near future – if you happen to meet him there please welcome him into the fold!

All that remains for me to say is a massive thank you to the SQLBits committee who do such an amazing job, all voluntarily, in putting this all together. Thank you Simon Sabin, James Rowland-Jones, Chris Webb, Darren Green, Allan Mitchell, Tim Kent, Chris Testa-O’Neill & Martin Bell. I also want to thank all of the volunteer SQLBits helpers that worked tirelessly on the weekend to make sure the whole thing ran smoothly.

Bring on SQLBits XII!

@Jamiet

30 May 23:35

Another argument for stored procedures

by Aaron Bertrand

Aaron Bertrand (@AaronBertrand) talks about a subtle way that ad hoc queries can interfere with SQL Server performance by taking up more space in the plan cache than they really need.

30 May 23:35

sp_spaceused Alternative

by Andrew Kelly

I don’t know why but for some reason I have never liked using sp_spaceused. It probably started a long time ago when the results were often inaccurate due to the meta data being out of date. I am pretty sure that was fixed somewhere along the line but the system stored procedure had some other limitations that usually prevented me from seeing the data that I really wanted in most cases. So at some point I created my own stored procedure to view the data I was most interested in almost on a daily...(read more)

View attached file (2.23 KB, application/x-zip-compressed)

30 May 23:34

Oracle to SQL Server Replication-Design considerations

by msrviking

I am thinking I have covered all and best of the points for one to configure a transactional replication between Oracle and SQL Server in all my previous posts. In this post I would like to give my 2 cents on design considerations and limitations of Oracle publishers that I had to keep in mind during this exercise. There is a neat list of other points published in this Technet article.

Configuration of Oracle Publisher with options

Per the above article I learnt that the Oracle publishing can be done using the Oracle Gateway and Oracle Complete options. Few things that make difference these two options is that Oracle Gateway provides better performance but doesn’t let multi-transactional publications of a single to-be or published table. Whereas in the case of Oracle Complete option the table can be published in several publications, but lesser performance.

I tried digging in Google to see on how these two options work as described, because I didn’t try on the setup I built to find the difference. I didn’t find any reliable sources that talk about difference in detail but one of the posts by Richard Douglas (Editor in Chief of SQL ServerPedia) gives a short description “Gateway is for high performance replication solutions. Complete is when you want to transform your data or to do row filtering”. I might have to try and see what it means in here.

I remember seeing row filtering in a transaction replication using the option of Oracle Gateway, and I am yet to find out on how this is different when Oracle Complete is chosen. For now I shall park for future tries on this limitation and share my findings.
Replication supports few objects and not all on Oracle

It’s quite obvious with this point (supports few objects) because the similarities that Oracle and SQL Server have at object level implementation are few. So,

What are the objects that are supported?
1. Tables
2. Index-organized tables
3. Indexes
4. Materialized views (as tables)
What are those objects that cannot be replicated, but can exist on published tables?
1. Domain-based indexes
2. Function-based indexes
3. Defaults
4. Check constraints
5. Foreign keys
6. Storage options (tablespaces, clusters, etc.)
What are the objects that cannot be replicated at all?
1. Nested tables
2. Views
3. Packages, package bodies, procedures, and triggers
4. Queues
5. Sequences
6. Synonyms
Okay with this list we know what can be replicated, what cannot be replicated? I picked the list from the above post, but was hunting to find out why we can’t replicate these objects. One simple reason that keeps coming to my mind is that “some of the above objects in the ‘cannot’ list don’t have equivalents in SQL Server”. Okay, so aren’t there workarounds for these objects? For example Packages have bundle of programmed objects like packaged variables, cursors and few others which have to be converted through a specific process, and this specific process can’t be through Replication. That exercise could be migration of objects from Oracle to SQL Server, and surely can’t be part of Replication topology. Does this apply for simple objects also? I don’t have any technical reason to explain for this too so concluding it with my 2 cents. Meanwhile I shall see if I can find better “reasons”.

Here is a link from Microsoft on what it all means to convert some of the above listed objects. This was one of the reasons that I helped me to conclude logically that this is part of migration from Oracle to SQL Server and can’t be part of Replication at a conceptual or architectural level.

Since my setup was to test the new Oracle Publisher I didn’t have to keep tab on other design considerations, but I am sure I will have to deal with others too when this gets on to full deployment mode than a POC.

I hope this short post helps you folks out there, and happy reading.

The usual ask – please feel free to share your thoughts or comments.

Cheers!

30 May 23:34

Querying the VMware vCenter Database (VCDB) for Performance and Configuration Information

In my previous post I showed how DBAs can be provided with read-only access to Virtual Center for monitoring VM performance in the data-center One thing that many administrators don’t realize or think about is that Virtual Center uses a database for storing information about the virtual data center and the most common database platform used for this purpose is SQL Server. Virtual Center can use an existing SQL Server in the environment, or it can install SQL Server Express during setup.

As a consultant, I do a lot of reviews of SQL Servers running on VMware, and as part of that work I have to understand the underlying virtualization configuration – not just for SQL Server, but for the virtualized data center. This includes needing information about the host configuration, other VMs running on the host, and performance metrics from the VM historical data that is stored in the VIM_VCDB database used by vCenter.

The VIM_VCDB database isn’t entirely documented, but a list of the views and their definitions can be found in the Using VirtualCenter Database Views technical note from VMware. Using this information, and additional information collected using a server-side trace against the database while working in vCenter, I’ve compiled a series of Transact-SQL scripts for pulling information from the VIM_VCDB database in an environment.

The first time I ever demonstrated these scripts was last week during the Virtualization module of our IE3: High Availability and Disaster Recovery course in Chicago. I’m also using these scripts in the Pluralsight course on SQL Server Virtualization that I am currently recording. When I demonstrated these scripts last week, the first question the class asked was where can they get them. We always provide the scripts from our demos to students, but I also promised to blog them as well this week.

Host Configuration

The VPXV_HOSTS view provides host level information and I commonly JOIN that view to the VPXV_VMS view to aggregate VM information per host as a part of a environment health check.

-- Host Configuration
SELECT
    vh.NAME AS HOST_NAME,
    HOST_MODEL,
    CPU_MODEL,
    CPU_COUNT,
    CPU_CORE_COUNT,
    CPU_HZ,
    CPU_THREAD_COUNT,
    SUM(CASE WHEN vm.POWER_STATE = N'On' THEN vm.NUM_VCPU ELSE 0 END) AS VM_VCPU_ACTIVE,
    MEM_SIZE,
    SUM(CASE WHEN vm.POWER_STATE = N'On' THEN vm.NUM_VCPU ELSE 0 END)*1./CPU_THREAD_COUNT AS THREAD_OVERCommit,
    SUM(CASE WHEN vm.POWER_STATE = N'On' THEN vm.NUM_VCPU ELSE 0 END)*1./CPU_CORE_COUNT AS CORE_OVERCommit,
    CAST(MEM_SIZE AS BIGINT)/1024/1024 AS MEM_SIZE_MB,
    SUM(CASE WHEN vm.POWER_STATE = N'On' THEN vm.MEM_SIZE_MB ELSE 0 END) AS VM_MEM_SIZE_MB,
    SUM(CASE WHEN vm.POWER_STATE = N'On' THEN vm.MEM_SIZE_MB ELSE 0 END)*1./(CAST(MEM_SIZE AS BIGINT)/1024/1024) AS MEM_OVERCommit,
    SUM(CAST(vm.MEMORY_OVERHEAD AS BIGINT)) AS VM_MEMORY_OVERHEAD,
    SUM(vm.MEM_SIZE_MB) AS VM_MEM_SIZE_MB_POTENTIAL,
    SUM(vm.NUM_VCPU) AS VM_VCPU_ALLOC_POTENTIAL,
    NIC_COUNT,
    HBA_COUNT,
    SUM(CASE WHEN vm.VMMWARE_TOOL = N'OK' THEN 1 ELSE 0 END) AS VM_TOOLS_OK,
    SUM(CASE WHEN vm.VMMWARE_TOOL = N'Old' THEN 1 ELSE 0 END) AS VM_TOOLS_OUT_OF_DATE,
    SUM(vm.NUM_VCPU) AS VM_VCPU_ALLOC
FROM dbo.VPXV_HOSTS AS vh
INNER JOIN dbo.VPXV_VMS AS vm
    ON vh.HOSTID = vm.HOSTID
GROUP BY vh.NAME, HOST_MODEL, CPU_MODEL, CPU_COUNT, CPU_CORE_COUNT, CPU_HZ,
    CPU_THREAD_COUNT, MEM_SIZE, NIC_COUNT, HBA_COUNT;

Performance Counter Information

Performance counters are maintained historically in the VIM_VCDB database at multiple roll-up levels. The VPXV_HIST_STAT_DAILY view has the daily roll-up values for each of the VMs and the VPXV_HIST_STAT_WEEKLY view has the weekly roll-up values.

Querying individual counter values and making decisions is not appropriate for all counters. An example of this is the CPU Ready counter which has to be calculated from the roll-up summation to determine if a problem actually exists (see CPU Ready Time in VMware and How to Interpret its Real Meaning). Two counters that I look at for the virtual data center and all of the VMs are CPU Ready and Memory Ballooned (vmmemctl).

CPU Ready Values

-- Daily %RDY values
SELECT
    vh.NAME AS HostName,
    vv.NAME AS GuestName,
    SAMPLE_TIME,
    SAMPLE_INTERVAL,
    (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 AS READY_PERCENT,
    CASE
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 5
                AND (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 < 10
            THEN N'WARN'
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 10 THEN N'RED'
        ELSE N'OK'
    END AS CPURDY_State,
    STAT_VALUE AS CPUReady_Summation,
    NUM_VCPU
FROM dbo.VPXV_HIST_STAT_DAILY AS vhsd
INNER JOIN dbo.VPXV_VMS AS vv
    ON vhsd.ENTITY = N'vm-'+CAST(vv.VMID AS NVARCHAR)
INNER JOIN  dbo.VPXV_HOSTS AS vh
    ON vv.HOSTID = vh.HOSTID
WHERE STAT_GROUP = N'cpu'
  AND STAT_NAME = N'ready'
  AND CASE
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 5
            AND (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 < 10
            THEN N'WARN'
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 10 THEN N'RED'
        ELSE N'OK'
        END <> N'OK'
ORDER BY CPURDY_State, READY_PERCENT DESC;

-- Weekly %RDY values
SELECT
    vh.NAME AS HostName,
    vv.NAME AS GuestName,
    SAMPLE_TIME,
    SAMPLE_INTERVAL,
    (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 AS READY_PERCENT,
    CASE
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 5
                AND (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 < 10
            THEN N'WARN'
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 10 THEN N'RED'
        ELSE N'OK'
    END AS CPURDY_State,
    STAT_VALUE AS CPUReady_Summation,
    NUM_VCPU
FROM dbo.VPXV_HIST_STAT_WEEKLY AS vhsd
INNER JOIN dbo.VPXV_VMS AS vv
    ON vhsd.ENTITY = N'vm-'+CAST(vv.VMID AS NVARCHAR)
INNER JOIN  dbo.VPXV_HOSTS AS vh
    ON vv.HOSTID = vh.HOSTID
WHERE STAT_GROUP = N'cpu'
  AND STAT_NAME = N'ready'
  AND CASE
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 5
            AND (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 < 10
            THEN N'WARN'
        WHEN (STAT_VALUE/ (vv.NUM_VCPU * SAMPLE_INTERVAL * 1000)) * 100 > 10 THEN N'RED'
        ELSE N'OK'
        END <> N'OK'
ORDER BY CPURDY_State, READY_PERCENT DESC;

Memory Ballooning Counters

-- Weekly Memory Ballooned
SELECT
    vh.NAME AS HostName,
    vv.NAME AS VMName,
    Start,
    Finish,
    tab.SAMPLE_INTERVAL,
    MIN(STAT_VALUE)/1024. AS MinBallooned_MB,
    MAX(STAT_VALUE)/1024. AS MaxBallooned_MB,
    AVG(STAT_VALUE)/1024. AS AvgBallooned_MB,
    COUNT(*) * (tab.SAMPLE_INTERVAL/60) AS MinutesBallooned
FROM dbo.VPXV_HIST_STAT_WEEKLY AS vhsd
INNER JOIN dbo.VPXV_VMS AS vv
    ON vhsd.ENTITY = N'vm-'+CAST(vv.VMID AS NVARCHAR)
INNER JOIN  dbo.VPXV_HOSTS AS vh
    ON vv.HOSTID = vh.HOSTID
CROSS JOIN (SELECT
                MIN(SAMPLE_TIME) AS Start,
                MAX(SAMPLE_TIME) AS Finish,
                SAMPLE_INTERVAL
            FROM dbo.VPXV_HIST_STAT_WEEKLY
            WHERE STAT_NAME = N'vmmemctl'
              AND STAT_VALUE > 0
            GROUP BY SAMPLE_INTERVAL) AS tab
WHERE STAT_NAME = N'vmmemctl'
  AND STAT_VALUE > 0
GROUP BY vh.Name, vv.Name, Start, Finish, tab.SAMPLE_INTERVAL
ORDER BY HostName, MinutesBallooned DESC;

-- Daily Memory Ballooned
SELECT
    vh.NAME AS HostName,
    vv.NAME AS VMName,
    Start,
    Finish,
    tab.SAMPLE_INTERVAL,
    MAX(STAT_VALUE)/1024. AS MaxBallooned_MB,
    AVG(STAT_VALUE)/1024. AS AvgBallooned_MB,
    COUNT(*) * (tab.SAMPLE_INTERVAL/60) AS MinutesBallooned
FROM dbo.VPXV_HIST_STAT_DAILY AS vhsd
INNER JOIN dbo.VPXV_VMS AS vv
    ON vhsd.ENTITY = N'vm-'+CAST(vv.VMID AS NVARCHAR)
INNER JOIN  dbo.VPXV_HOSTS AS vh
    ON vv.HOSTID = vh.HOSTID
CROSS JOIN (SELECT
                MIN(SAMPLE_TIME) AS Start,
                MAX(SAMPLE_TIME) AS Finish,
                SAMPLE_INTERVAL
            FROM dbo.VPXV_HIST_STAT_DAILY
            WHERE STAT_NAME = N'vmmemctl'
                AND STAT_VALUE > 0
            GROUP BY SAMPLE_INTERVAL) AS tab
WHERE STAT_NAME = N'vmmemctl'
  AND STAT_VALUE > 0
GROUP BY vh.Name, vv.Name, Start, Finish, tab.SAMPLE_INTERVAL
ORDER BY HostName, MinutesBallooned DESC;

Summary

The VIM_VCDB contains a lot more information than I’ve covered in this post. These are simply examples of the types of information that you can query from the database as a part of troubleshooting problems or evaluating the configuration of a virtual data center. One of the important things about having access to this data is to always validate that you are applying the appropriate understanding to the values being stored in the database.

The post Querying the VMware vCenter Database (VCDB) for Performance and Configuration Information appeared first on Jonathan Kehayias.

30 May 23:33

Using CROSS APPLY instead of a calculated variable

Earlier this week I posted The many uses of CROSS APPLY and I’m quite glad I did. I’ve been working on a series of audit queries for the last couple of weeks and got thrown a bit of a curve today. The original specs included several comparisons against a date. Basically the following.

WHERE (date1 >= '5/31/2008' AND date2 >= '5/31/2008')
	OR (date3 >= '5/31/2008' AND date4 >= '5/31/2008')

I was a little shocked when I was told several weeks into the project “Next month it will change to June 30 2008.” So in reality rather than a constant (used no less than 7 times throughout each of 13 views) I now needed a calculation.

Not really a huge issue. It’s a simple enough calculation and it would be simple enough to just copy and paste it everywhere it’s needed. I just really don’t want the repetitive code. Not to mention adding that much more complexity to the views. And I shudder to think of having to change it in over 90 different places if the business specs change. So the next thought I had was using a variable, unfortunately these have to be views, although I guess I could have used TVFs (table valued functions) but still not the best solution.

Now recently I watched Kendra Little’s 5 T-SQL Features You’re Missing Out On and learned that you could use CROSS APPLY for reusable calculations. Well that’s what this is right? A reusable calculation.

So now instead of looking like this: (The calculation is from one of Pinal Dave’s blogs, and the calculation is for September 2003. If you are reading this past May 2013 that will have changed.)

SELECT *
FROM Sales.SalesOrderHeader
WHERE (DueDate >= DATEADD(mm, DATEDIFF(mm,0,GETDATE())-116,0) 
			AND DueDate < DATEADD(mm, DATEDIFF(mm,0,GETDATE())-115,0))
   OR (ShipDate >= DATEADD(mm, DATEDIFF(mm,0,GETDATE())-116,0) 
			AND ShipDate < DATEADD(mm, DATEDIFF(mm,0,GETDATE())-115,0))

It now instead looks like this.

SELECT *
FROM Sales.SalesOrderHeader
CROSS APPLY (SELECT DATEADD(mm, DATEDIFF(mm,0,GETDATE())-116,0) AS StartDate, 
			DATEADD(mm, DATEDIFF(mm,0,GETDATE())-115,0) AS EndDate) AS Vars
WHERE (DueDate >= Vars.StartDate AND DueDate < Vars.EndDate)
   OR (ShipDate >= Vars.StartDate AND ShipDate < Vars.EndDate

Personally I find this much easier to read. Of course if the performance isn’t there then it doesn’t really matter. Fortunately or unfortunately depending on how you look at it, all of my tests so far have been inconclusive. The query plans are very slightly different on some queries and the same on others. The run times are only slightly different and neither one runs faster every time. Until I see something conclusive one way or the other I’ll consider performance close enough with a big edge on readability and maintainability going to the CROSS APPLY.

Filed under: Microsoft SQL Server, SQLServerPedia Syndication, T-SQL Tagged: code language, CROSS APPLY, language sql, microsoft sql server, sql statements, T-SQL

30 May 23:33

Security Questions: Grant Permissions to All Stored Procedures

As I mentioned in the introductory post, in the Introduction to SQL Server Security session for Pragmatic Work’s Training on the T’s, I received a large number of questions that there wasn’t time to answer. Instead of just a re-cap of all of the questions, instead I’ve opted to put together a post per topic. Hopefully, this will help, not just those that attended the session, but also anyone searching for the same questions later on.

StoreD Procedure question

The second question in the list is:

Do we have easy way to grant all stored procedures execution in a single shot?

When I was a DBA working on SQL Server 7 and SQL Server 2000, this was a near constant pain. I wanted to give users the ability to only execute stored procedures and wanted the least amount of hassle while doing it. Back then, that meant a script that would run nightly and verify that all stored procedures were included in the database role for executing stored procedures. Needless to say, while automated, the delay between deploying the procedure and granting access often got the better of me and we really needed something different.

In steps SQL Server 2005. At this point, you might be thinking, “are you going to tell me that I’ve been able to do this since SQL Server 2005?” Yes, since SQL Server 2005, there has been a permission that allows execution for all stored procedures to be assigned to any principal.

Exec All Stored Procedure Demo

Granting the permission to execute all stored procedures is rather simple. It just requires knowing the user (or role) that needs the permissions and a grant statement, shown in listing 1.


--Listing 1. Grant exec to all stored procedures in a database

GRANT EXEC TO [User]
GO

Of course, the question that immediately follows the permissions to grant all stored procedures is one where the users need access to all of them, except these other ones. There are a few ways this can be approached. One useful manner to control the permissions is to scope the permissions to a specific schema. The GRANT option, shown in listing 2, can allow the user to only execute stored procedures in the schema. Or the user could be granted permissions for all schemas, as was done previously, and then the schema that shouldn’t be accessed has DENY permissions set.


--Listing 2. Permissions based on executing stored procedures in a schema

GRANT EXEC ON SCHEMA::SomeSchema TO [User]

DENY EXEC ON SCHEMA::AnotherSchema TO [User]
GO

There are numerous ways to achieve this, but I’ve always found it useful to know that you can scope permissions, such as EXEC, to the schema.

Summary

If you weren’t aware of the flexibility in assigning broad permissions to stored procedures, this post has hopefully enlightened you to some new security options. Are there any permission combinations that you’ve tried to accomplish that haven’t worked out right? Or, have you come up with an interesting permission solution that you think others might find useful? Leave a comment and let us in on your solutions or problems.

You just finished reading Security Questions: Grant Permissions to All Stored Procedures! Consider leaving a comment!

Keep up to date with posts via RSS or on twitter at StrateSQL

30 May 23:32

Cross-Post: Microsoft announces major expansion of Windows Azure services in Asia

by Craig Kitterman

Editor's Note: The following is a post from Satya Nadella, President of the Server & Tools Business at Microsoft.

As the worldwide demand for cloud computing continues to grow, so does Windows Azure. Microsoft is the only at-scale global public cloud provider to deliver a hybrid cloud advantage and we’re excited to announce plans to invest hundreds of millions of dollars to expand the Windows Azure footprint in Asia – specifically in China, Japan and Australia.

On Wednesday in Shanghai, Microsoft CEO Steve Ballmer announced that a Public Preview for the Windows Azure service, operated by 21Vianet in China, will be available for sign-up starting June 6. The announcement in China builds on the agreements we signed Nov. 1, 2012 with the Shanghai municipal government and 21Vianet. We are excited to be the first multinational organization to make public cloud services available in China, and encourage customers to sign up for the free trial at http://www.windowsazure.cn starting June 6.

For additional details on the expansion, please read my post on the Official Microsoft Blog.

Satya

30 May 23:32

The Journey is the Reward: Speaking for SQLPass, where do you start?

I'm heartened by the numbers of people who are expressing an interest in speaking at events. In particular, I understand that SQLPass had record numbers of people submitting to SQLPass Summit in Charlotte this year. For me, this is especially important because it suggest that more people are interested in speaking and sharing their SQL Server knowledge. It also shows that the community are supportive of a SQLPass Summit event in Charlotte. This energy shows the strength of the event itself, rather than being tied to a particular location i.e. Seattle. As much as I love Seattle, I have never been to Charlotte and I'm excited to be visiting a new part of the United States. In case you are wondering, I am Scottish :)

I am not part of the session selection process for SQLPass so this blog is based on my insights from the community only. How do you get experience so that you can speak for events, perhaps with the goal of speaking at the SQLPass Summit event, which is the pinnacle of every SQL Server guru's career?

It can seem to be a vicious cycle: you can't speak because you don't have experience, and you don't get experience because you don't speak.

This isn't true!

There are plenty of opportunities to speak and here are some:

Come and join me and give a webinar for the Data Warehousing and Business Intelligence Virtual Chapter. This is a great way to start and you will have the full support of myself and Julie Koesmarno, who helps with the Chapter as I do. We can help you to get started. The sessions last for about one hour, and we do 45 minutes talk, and then 15 minutes of questions. Why not give it a go?

Your local PASS Chapter - come and attend your local user Group, and see how you can get involved. Take a look at the PASS site for some thoughts

Your workplace - why not try a 'brown bag lunch' session, whereby you and your team all take turns to present on something? It is a safe environment to learn, and you might even enjoy it!

If you didn't get a slot at SQLPass Summit this time - don't be disheartened. There is a Chinese proverb: The journey is the reward. Keep going, and who knows what will happen? You'll learn, and make friends as you go along. Submit again next year with more experience under your belt, and more knowledge of your subject. Good luck!

A special note for UG leaders - make sure and support your new speakers. Mentor and encourage them. That person might be the next Brent Ozar or Kalen Delaney, so give them the opportunity to grow. We are all part of a community and we can help one another!

Good luck everyone! Please email me at jen.stirrup@copperblueconsulting.com if you'd like to give a webinar. I'd love to schedule you in, and help you get started!

Best,
Jen x

Mrdenny

Shared posts

1. My Switch Is Bored

2. Invisible Plan Operators

The problem

Explanation

Cost-based optimization

New Cardinality Estimates

Performance affecting plan operators

(Table-Valued Function)

/ (RID Lookup / Key Lookup)

/ (Table Scan / Clustered Index Scan)

(Select)

Wrap-up

Start with the Data

Next Is Data Management

Move to Data Transport

Finally, Data Processing

Sure, We Need A Front-End At Some Point

References

1. State the problem succinctly in the title

2. Explain the Error Completely

3. Explain what you have done to research the problem

4. Say "Please" and "Thank You"

TL;DR Version

Caveats and "Ya I know."

Wifi.exe and it's Usage

The Code

Making Columns with Console.WriteLine and String.Format

Gratuitous use of Linq

Cleaning up the temp XML files

Capturing Command Line Output

Alternatives

Host Configuration

Performance Counter Information

CPU Ready Values

Memory Ballooning Counters

Summary

StoreD Procedure question

Exec All Stored Procedure Demo

Summary