Shared posts

20 Apr 19:37

My First…

by Chris Lumnah
My Hello World Post
12 Nov 05:06

Bye Bye, Bullets: The Stack Overflow Developer Story is the New Technical Resume

by Jay Hanlon

If you write code, you know that you’re more than a list of places where you worked or went to school. However you got to where you are now, what should matter is what you’ve built, and what you can do. Whether you're currently looking for new opportunities or not — and whether you're active on Stack Overflow or not — your Developer Story is the best way to share whatever it is that you take pride in.

It’s your story; tell it your way.

Get yours today at s.tk/story. It only takes a few minutes, and you do NOT want some other joker snapping up the good URLs. Take it from me, stackoverflow/story/J@yH@n10n.

Technology has evolved. Tech resumes? Not so much.

The resume was invented by Da Vinci in the 15th century. It mostly served as a letter of introduction for traveling lords… and it hasn’t changed much since. What do you see when you scan the bold stuff on a resume? Employers, job titles, schools, and degrees. And a lot of small bullets. Plus maybe an other-stuff-intended-to-round-me-out-as-an-actual-human section at the bottom.

In the roughly five centuries since resumes were created to help nobility vouch for roadbound gentry, they’ve stayed mostly optimized for one thing: conveying the importance of your pedigree.

The emphasis is all on the seniority of your titles and how impressive your companies or schools have been. Which is a great way for some developers to put their best foot forward. Have a Masters in CS from Yalemouth? Cool! That’s one good signal. But it ain't the only one. Heck, even politicians know that fancy schools are only one of many ways to signal potential:

“It turns out it doesn’t matter where you learned code, it just matters how good you are at writing code.” — President Barack Obama

Show, don’t just tell.

Developers, fundamentally, are makers. Like designers… or architects… or jugglers! You wouldn’t hire any of them based on a list of titles or places they’d worked. ("Oh, a Senior Associate Juggler? Get her! I bet she can do those flaming sticks and stuff!") So why hire a developer that way? Makers' skills are conveyed by showing, not telling. Portfolios. Blueprints. This awesome juggling video.

Be more than bullets: Your best work, front and center.

The Developer Story lets you share what you've worked on by linking to actual features you’ve worked on, blog posts, or public code. Those things shouldn’t be described in the second clause of a tiny bullet, or relegated to the “Other interests” section of a resume, to sit unassumingly next to your second-place trophy in intramural tetherball.

The Developer Story puts the work you’re most proud of where it belongs: at the grownup table, right alongside your roles or schools.

Which of these better tells the real story?

No reputation points? No problem. Posting answers or contributing to Documentation is one way to build up public artifacts of your coding experience, but it’s not the only way. Your Developer Story lets you show off whatever you work on. So you’ll look good, no matter how much rep you have on Stack Overflow.

Find a job you'll love, on your terms.

If you’re ready for your next challenge, the Developer Story makes it easy to learn about jobs that fit your personal criteria. And since developers are in demand, we know it’s not about finding a job — it’s about finding the right job. The same way Stack Overflow Q&A put the right answer right on top, when you fill out your profile, Stack Overflow Jobs shows you roles that match the technology or projects you want to work on. (Even if they’re not what you’re using today.) And we always puts the developer in control. No spam, no BS. (Did you know that Stack Overflow Jobs actually penalizes blind messaging by employers?)

And it's 100% backwards-compatible with those crusty old resumes - it can still highlight fancy schools and fancy titles, and anything else you'd include on a traditional resume. So if you're like me, and still just a little proud that you got off the waitlist and eked your way into a school above your intellectual weight class, you can still show off your alma mater off with pride.

But don’t employers just want role, schools, and keywords? Nope. It turns out that they’ve just accepted that that’s all that they can generally get from a resume, leaving them stuck using “seven vs. eight years of JavaScript experience” to determine whom to interview. The employers we’ve shared it with love the way the Developer Story gives them real, tangible ways to understand what a candidate has actually worked on.

It's not just for developers who are job hunting.

Happy at your current gig, but open-minded? According to our 2016 developer survey, 63% of developers aren't actively looking, but are open to learning about new opportunities to level up. If you’re enjoying where you are, but open to finding that perfect new challenge, the Developer Story can help you keep an eye on what’s out there, and we’ll only send you the best opportunities that match your goals.

Completely not interested in jobs, but proud of what you’ve made? Listen, I get it. I never want to leave my job. The Developer Story is for you, too. Like most people who make things, devs often like to share what they’ve built. So it’s designed so you can use it as your own “coding central," even if you’re not job hunting. Just create a Developer Story, but set your job preference to “Not Interested.” You can share the features you’ve worked on, Github repos, blog posts, or even books you’d recommend. It’s your story. Tell it your way.

Once your Developer Story is beautiful, let’s work on your wardrobe.

We think one of the best ways to inspire other devs is by sharing some of the awesome stuff their peers are building. Or by showing a curious twelve year old what someone else who loves breaking machines can grow up to be.

So, for the next two weeks, if you tweet a link to your developer story, and include the #mydevstory hashtag, we’ll enter in you in a contest to win one of two hundred gen-u-ine Stack Overflow tees (50 mens and 50 women shirts each week.) And don’t forget the link to your Developer Story - we need it to contact you if you win.

Take a minute to start your story now.

(We hope you’ll indulge our use of Rear Admiral Hopper’s incredible biography as a case study for the images in this post. You should not take this to mean that the Rear Admiral endorses Stack Overflow in any way. You can, however, take it to mean that we endorse her. Wholeheartedly.)

Grace Hopper

Thanks to Abby T. Mars, Elaine Wang, Kaitlin Pike, Kit Carrau, Rachel Maleady, Taryn Pratt, and Tim Post for helping to improve this post.

03 Nov 16:33

Vblock and VxBlock use Cisco UCS. Got it?

by Chad Sakac
EMC logo

Dear reader – have you ever been misquoted?  Has communication in your org ever been imperfect?   I’m sure you have – and it can be frustrating.

There was a CRN article that went live today where it sounds like we have or will have VxBlocks that have non-UCS stuff in them.

NOPE.   The buck stops with me, and it’s a FIRM NO.

Dell EMC PowerEdge is an awesome rack mount platform, and FX2 is an awesome modular platform.   We are making PowerEdge a universal ingredient in our HCI portfolio, like NOW.

When customers buy CI like Vblock and VxBlock – they are getting the cake, not the ingredients (flour, eggs, sugar).   BUT – the ingredients create parameters of the system, and UCS B-Series blades make for great CI (which has an external storage array in it).   We are in NO rush to lower the competitiveness of our Vblock and VxBlock cake.

If you want to know what the difference is between a Vblock and VxBlock – it’s simple, and has NOTHING to do with the server.   Vblock uses the N1Kv virtual switch (and cannot have ACI down to the VM – you can have ACI in the fabric and to the host… also no NSX), VxBlock uses the VMware distributed virtual switch (VDS), which in turn means it can have NSX and ACI from us.    The Release Certification Matrix (RCM) for Vblock and VxBlock doesn’t include the Cisco AVS.  If you want to know more, read here.

Via services, we can convert Vblocks to VxBlocks.   The vast majority of what we ship in the “Block” family right now are actually VxBlocks.

Will we create bundles of Dell EMC Storage with Dell EMC PowerEdge – yes.  But that’s a bundle a reference architecture, not an engineered system.   No lifecycle, no sustained engineering.   People that don’t understand this key difference don’t understand that CI => (is greater than) a bundle.

There is NO plan to have VxBlocks with anything other than UCS, and that’s a fact.  Capiche?

03 Nov 16:33

Nomenclature Matters

by Martin Valencia
EMC logo

The word nomenclature comes from the latin word nomenclatura and is defined as the devising or choosing of names for things, especially in science or other discipline. Also know as a body or system of names in a particular field. You’re probably wondering what the heck i’m talking about, but bear with me. 

In the last ten years the IT industry has changed drastically, and the pace of innovation is only increasing. As professionals, we’ve always had to evolve to stay relevant in our field, even though in the past, the delta at which the evolution was necessary was much greater. I’ve tried to stay slightly ahead of the curve, at the very least, with the curve. If there was any one thing that I could point to that helped me through each evolution it would be the nomenclature. Learning and understanding new terms has been the single most successful thing that’s helped me when learning something new.

Think back to when you first heard about virtualization, there were a ton of new terms that you probably had never heard of; i certainty hadn’t.

  • hypervisor
  • virtual switch
  • vmkernel
  • HA
  • DRS

Virtualization has been around for a while now, most people are very familiar with those terms and how the underlying technology works. The pace of innovation has increased since the rise of virtualization which has brought us to cloud, containers, software defined networking, DevOps, IaaS, PaaS, OpenStack, hyperconverged…to name a few. New concepts to understand, and with it, more nomenclature:

  • endpoint groups
  • virtual private cloud
  • application centric infrastructure
  • microservices
  • service catalog
  • dockerfile
  • CI/CD
  • containers
  • security groups
  • redshift
  • distributed logical router
  • images
  • merge, pull, push
  • auto-scaling groups
  • multi-tenancy

Of course there are more, I could write pages full of new terms that have emerged in the past few years. OK, so you have some new terms, new nomenclature to learn and understand. On top of all that, companies decide they want to use overlapping terms that mean different things. The term container is a great example; container means five different things to five different companies.

I know this entire post is pretty obvious to most people; words are important. What I hope you will take away from this, is while things may be changing rapidly, at a vigorous pace, try not to be completely overwhelmed. When you sit down to read about a new technology or concept that is foreign to you, start with the nomenclature. In my experience it will give you the best chance for success for understanding and applying whatever it is you’re trying to learn.

03 Nov 16:33

Dell EMC Announces Isilon All-Flash

by Dan Frith
EMC logo

You get a flash, you get a flash, you all get a flash

Last week at Dell EMC World it was announced that the Isilon All-Flash NAS (formerly “Project Nitro“) offering was available for pre-order (and GA in early 2017). You can check out the specs here, but basically each chassis is comprised of 4 nodes in 4RU. Dell EMC says this provides “[e]xtreme density, modular and incredibly scalable all-flash tier” with the ability to have up to 100 systems with 400 nodes, storing 92.4PB of capacity, 25M IOPS and up to 1.5TB/s of total aggregate bandwidth—all within a single file system and single volume. All OneFS features are supported, and a OneFS update will be required to add these to existing clusters.

isilon_all-flash_001

[image via Dell EMC]

 

Why?

Dell EMC are saying this solution provides 6x greater IOPS per RU over existing Isilon nodes. It also helps in areas where Isilon hasn’t been as competitive, providing:

  • High throughput for large datasets of large files for parallel processing;
  • IOPS intensive: You can now work on billions of small files and large datasets for parallel processing;
  • Predictable latency and performance for mixed workloads; and
  • Improved cost of ownership, with higher density flash providing some level of relief in terms of infrastructure and energy efficiency.

 

Use Cases?

Dell EMC covered the usual suspects – but with greater performance:

  • Media and entertainment;
  • Life sciences;
  • Geoscience;
  • IoT; and
  • High Performance Computing.

 

Thoughts and Further Reading

If you followed along with the announcements from Dell EMC last week you would have noticed that there have been some incremental improvements in the current storage portfolio, but no drastic changes. While it might make for an exciting article when Dell EMC decide to kill off a product, these changes make a lot more sense (FluidFS for XtremIOenhanced support for Compellent, and the addition of a PowerEdge offering for VxRail). The addition of an all-flash offering for Isilon has been in the works for some time, and gives the platform a little extra boost in areas where it may have previously struggled. I’ve been a fan of the Isilon platform since I first heard about it, and while I don’t have details of pricing, if you’re already an Isilon shop the all-flash offering should make for interesting news.

Vipin V.K did a great write-up on the announcement that you can read here. The press release from Dell EMC can be found here. There’s also a decent overview from ESG here. Along with the above links to El Reg, there’s a nice article on Nitro here.

03 Nov 16:33

vSphere on AWS – A Deal with the Devil?

by Andrew Miller
EMC logo

Thanks to John Troyer‘s always excellent TechReckoning newsletter I ended up writing down some thoughts around vSphere on AWS – a topic I’ve been discussing and reading a lot about recently. While I can’t say for sure that what’s below is unique to me, I don’t think I’ve read this exact twist anywhere else. If you disagree, let me know in the comments — I won’t be offended. Let’s get to it…

Note: what’s below is mildly cleaned up from email back to John but not overly much. If anything isn’t clear, let me know – it was partially written in response to TechReckoning Dispatch, Vol. 3, Number 16, October 17, 2016.

Theory: vSphere on AWS is a deal with the devil. “Who is the devil?” is the main question.

  • Fact: if VMware could have put Amazon out of business, they’d have done it already. They’ve not been able to because Amazon offers quasi-decent Platform 2/2.5 capabilities but unmatched Platform 3 services (the competitive moat has alligators with rocket launchers on their backs) for “Cloud Native Applications” getting written today. Everyone believes applications are moving a direction that favors AWS – how fast is the open question (see the 451 group kerfuffle at VMworld). VMware believes it will be a while and in the meantime this can hugely drive NSX adoption (still something of a solution searching for a problem from what I’ve seen outside of security/microsegmentation).
  • Fact: if AWS could have put VMware out of business, they’d have done it. They’ve not been able to because refactoring apps for Platform 3/AWS style services takes time. The bottleneck IMHO here is developer quantity – The New Kingmakers hypothesis only goes so far if there’s just not enough folks with the right skill sets (why AWS wants their channel partners to be “born in the cloud” and HEAVILY focused on services).

So to me I read this as a play where AWS & VMware are both betting on the timeframe around refactoring applications to Platform 3 (provide me a better buzzword and I’ll happily use it – maybe I should just say Cloud Native Applications instead).

VMware is betting it will take longer than AWS thinks and in return they’ll get widespread NSX adoption plus preserve + build on being the dominant place for Platform 2/2.5 apps for the foreseeable future (I do also completely agree about Vmware+AWS concerns around AzureStack).

AWS is betting that applications will get refactored faster than VMware does and in the meantime why not 1) drive even more economy of scale + revenue, 2) let people get closer to the AWS services/API’s and remove some potential hindrances to consuming them (“wow…elastic DRS is cool…what if our applications could that that at an app-level? Oh…all these cool next gen services/APIS’s are in the same datacenter at LAN latency….wow!”), and 3) embrace VMware so tightly VMware may not be able to let go.

What if vSphere on AWS actually becomes dominant? Is that a bigger risk for AWS or VMware? Data still has weight and takes time to move.

Call it a deal with the devil…for both sides.😉 Who’s the smarter devil though?

Side-note: while I’ve not searched for it, what’s also been interesting to me is that there’s been zero commentary on all the traditional storage criteria that usually drive 70%+ of virtual infrastructure cost (IOPs, latency, space, etc.).

What’s not obvious to me is VMware’s next play if vSphere on AWS and NSX dominate – that’s great but there’s got to be a play after that as I don’t think anyone thinks Platform 2/2.5 apps will be a sufficient TAM for more than 5-15 years (I’m sure some of the smarter folks than me at VMware have lots of thoughts on that front). Hopefully by then NSX and the oft-touted but rarely realized Automation+etc. plays of the vRealize Suite will actually amount to something.

What do you think? I’d love to hear your thoughts on this topic.


03 Nov 16:32

vSAN,… correction its VSAN,… OK, OK, its vSAN now, VSAN and Virtual SAN are wrong.

by Peter Keilty
EMC logo
I spent 4 years at EMC prior to moving to VMware over 3 years ago to join the Software Defined Storage team. At EMC it was always a challenge to get the acronyms and names correct. When Acadia (VCE) first came out with the “Vblock” everybody wanted to type it as “vBlock”. I’d always try […]
03 Nov 16:31

DellEMC World - Austin, Texas - Part 1

by emc-community-network@emc.com (Allen Ward)
EMC logo
Disclaimer  Everybody loves a good disclaimer... or maybe that's just me.But either way I want to drop a disclaimer at the beginning of this post about DellEMC World to make sure everyone is on the same page.This post is mostly about my recent
03 Nov 16:31

XtremIO Directions (AKA A Sneak Peek Into Project-F)

by Itzik Reich
EMC logo

At DellEMC World 2016, together with Chris Ratcliffe (SVP, CTD Marketing) and Dan Inbar (GM XtremIO & FluidFS), we gave a session on what we call “XtremIO Future Directions”, we really wanted to show without getting into too much details where we are heading in the next few years, think of it as a technical vision for the near term future.

We started by giving a quick overview of the business, we think that Dell EMC XtremIO is the FASTEST growing product ever seen in the storage business, with more than 40% of the total AFA market share since our GA in November 2013 is something that can’t be taken lightly. For me personally, I can say that the journey has been amazing so far, as a relatively young father, I think of the acceleration the product had to go through in such a short time, the market demand is absolutely crazy! More than 3,000 unique customers and over $3Bn in cumulative revenue. From a technical perspective, If I try to explain the success of XtremIO over other AFAs, it really boils down to “purpose built architecture” – something which many other products now claim, when we built XtremIO following four pillars were (and in my opinion still are) mandatory building blocks:

  1. Purpose-Built All Flash Array

    We had the luxury of writing the code using a clean slate, that meant SSD’s only in our case and many unique features e.g. XDP, that could never have happened in the old world (you can read more about XDP here: http://www.emc.com/collateral/white-paper/h13036-wp-xtremio-data-protection.pdf)

  2. Inline All The Time Data Services

    XtremIO is a CAS (Content Aware Storage) architecture, many customers think of dedupe (for example) as a feature, in the case of XtremIO, it really isn’t. In the old days we described deduplication as a “side effect” of the architecture but since “side effect” is normally thought of as a bad thing, we took that terminology out J. But seriously, what I mean is that we examine the data in real time and give each chunk of data a unique signature and by doing so, when we write the data, if the signature already exists, we simply dedupe it, there is no post process hashing like so many products out there, no “throttling” to the CAS engine etc. This is a killer feature not just because of data savings but rather HOW the data is being written and evenly distributed in real time across all the available nodes and drives, apart from the CAS/Deduplication engine, we of course compress all the data in real time, no knobs needed. We also encrypt the data and everything is done while using thin provisioning so you really only store the unique data you are producing, again, all in real time.

  3. Scale-Out IOPS & Bandwidth

    Wow, this one is funny – I remember in the old days spending hours explaining why a Scale-Out architecture is needed and had to debunk competitors claims like “no one needs 1M IOPS” and “no one needs more than a dual-controller architecture”. While I would agree that not everyone needs it, if I look at our install base, so many DO. It’s also not just IOPS, the tightly coupled Scale-Out architecture is what gives you the bandwidth and low latency that your applications need.

  4. CDM (or, unique Snapshots capabilities)

    If IDC/Gartner are right by saying that more than 60% of the data stored in your datacenter is actually multiple copies of your data and if your storage array can’t cope with these copies in an efficient way (note that “efficient” is not just capacity but also without performance penalty) then, you’re not as agile as you could be and your TCO goes out of the window – read more about it here:

    http://wikibon.com/evolution-of-all-flash-array-architectures/

    Luckily, XtremIO snapshots have been specifically designed to be ultra-efficient and as a result we see many customers that are using them not just for the primary data itself but for its copies as well, you can read more about it here https://itzikr.wordpress.com/2014/05/12/xtremio-redefining-snapshots/

    https://www.emc.com/collateral/solution-overview/solution-overview-xtremio-icdm-h14591.pdf

    Moving on to our longer term vision for XtremIO, what’s interesting is the that the current XtremIO array, (internally known as X1), was never the endgame, it’s rather a STEP on the road to what we believe customers will need in the coming years. The issue we have faced is that to build what we want means that many new technologies need to become available. We are gradually implementing them all, each cycle, with what’s available from a hardware/software perspective.


The current architecture is doing an amazing job for:

  • Acceleration (the ability to simply accelerate your workload, e.g. your Database) by moving it to XtremIO.
  • Providing very good copy data management (iCDM)
  • Consolidation of large environments into the cluster (think virtualized environments with thousands of VMs residing in the large XtremIO cluster configurations, up to 8 X-Bricks, all 16 storage controllers running in an Active/Active fashion with global dedupe and compression)

We aren’t stopping there, we are going to take scalability to different dimensions, providing far denser configurations, very flexible cluster configurations and…new data services. One of them is the NAS add-on that was the highlight of the session we had at Dell EMC World. Note that it is only one of a number of new data services we will be adding. So why did we mention NAS specifically if we are going to introduce other data services as well? It’s very simple really, this is the first “Dell” and “EMC” World, we wanted to highlight a true UNIFIED ENGINEERED
solution coming from both companies which are one now (Dell EMC).

Before moving to read ahead about the NAS part, we also highlighted other elements of the technical roadmap e.g. the ability to really decouple compute (performance) from capacity, ability to leverage some elements of the Software Defined Storage into the solution and to really optimize the way we move data, not just in the way it lands in the array but rather going sideways to the cloud or other products. Here again, the core CAS architecture comes into play, “getting the architecture right” was our slogan in the early days, you now understand why it was so important to make it right the first time..

Ok, so back to the NAS Support! We gave the first peek into something internally called “Project-F and I must say, the response has been overwhelming so we thought we should share it with you as well – please note that a lot of this is still under wraps and as always, when you deal with roadmap items, the usual disclaimers apply – roadmaps can change without notice etc.

Ok, so what is it?

During 2017, Dell EMC will release an XtremIO-based unified block and file array. By delivering XtremIO NAS and block protocol support on a unified all-flash array, we plan to deliver the transformative power and agility of flash in modern applications to NAS. XtremIO is the most widely adopted All-Flash platform for block workloads. However, we recognized an opportunity to extend our predictable performance, inline data services, and simple scalability to file based workloads. This is the first unified, content-aware, tightly-coupled, scale-out, all flash storage with inline data services to provide consistent and predictable performance for file and block.
The main objective of the Dell/EMC merger is to create an entity that is greater than the sum of its parts. Not just operational synergies, but technical synergies that enable new compelling solutions. The new XtremIO file support feature set is the first of many synergies to come. XtremIO’s NAS capabilities are based on a proven NAS file system (FluidFS) from DELL Technologies and offers:

  • Full NAS feature-set
  • Supports Multiple NAS protocols (NFS, SMB/CIFS, HDFS and NDMP)
  • Over 1M NAS IOPS, predictable sub-millisecond latency
  • Enterprise-grade scalability, and reliability

A common question that I get is “don’t you already have other NAS solutions?” to me, this question is silly. EMC and now, Dell EMC has always been about a portfolio approach. Lets ignore NAS for a second, wouldn’t this question be applicable to Block based protocols (and products) as well? Of course it will, and as in Block, different products are serving different use cases, for the XtremIO NAS solution, we were looking for a platform that can scale out in a tightly-coupled manner, where metadata is distributed, one that can fit the use cases below. Again, there is nothing wrong with the other approaches, each has their cons/pros for different use cases which is the beauty of the portfolio, we don’t force the use case to one product, we tailor the best product to the specific customer use case. Regardless of the problem you are trying to solve, Dell EMC has a best-of-breed platform that can help. If you want to learn more about the storage categories, I highly encourage you to read chad’s post here http://virtualgeek.typepad.com/virtual_geek/2014/01/understanding-storage-architectures.html

Target Use Cases

This solution targets workloads that require low latency, high performance, and multi-dimensional scalability: Transactional NAS applications are well-suited use cases for the XtremIO NAS capabilities. A few examples are VSI, OLTP and VDI and mixed-workloads including TEST & DEV, and DEVOPS. These workloads will also leverage XtremIO’s data services such as compression and deduplication.

A good rule of thumb is, if you are familiar with the current XtremIO use cases and want them to be applied over NAS/SMB, that is a perfect match.

File Capabilities & Benefits

With its addition of NAS protocol support, XtremIO can deliver all this with its unique scale-out architecture and inline data services for Transactional NAS and block workloads. Storage is no longer the bottleneck, but enables database lifecycle acceleration, workload consolidation, private/hybrid clouds adoption, and Transactional NAS workload optimization. The key features and benefits include:

Unified All-Flash scale-out storage for block and file
• Multi-protocol support of FC, iSCSI, NFS, SMB, FTP, HDFS and NDMP
• Predictable and consistent high performance
• In memory metadata architecture w/ inline data service


Scalable
• Over 1M NAS IOPS w/ sub-millisecond latency
• Billions of files
• Petabytes scalability
• Elasticity – scale-out for block and file, grow when needed, the NAS part can scale from one appliance (2 Active/Active controllers) to 6 appliances (12 Active/Active controllers!)


Simple
• Single unified management (scroll down to see a demo of it)

Resilience
• Built-in replication for file
• Enterprise grade availability
• Proven technology from DELL and EMC


Efficiency
• Inline deduplication and compression
• Efficient virtual copy technology for block and file
• Thin provision support

Both file and block will be using the XtremIO inline data services such as encryption, inline compression and deduplication. In addition, for file workloads native array based replication is available.

Other technical capabilities includes:
• ICAP for antivirus scan
• Virtual copy technology
• Remote replication
• Quotas (on name, directories or users)

Below, you can see a recorded demo on the upcoming HTML5 UI, note that it is different than the web UI tech preview that we introduced in july 2016 ( https://itzikr.wordpress.com/2016/06/30/xios-4-2-0-is-here-heres-whats-new/ ), yes, as the name tends to suggest, during the “tech preview”, we have been gathering a lot of customers feedback about what they want to see in the future UI of XtremIO (hence the name, tech preview)



If you want to participate in a tech preview then start NOW! Speak to your DellEMC SE / Account Manager and they’ll be able to help you enroll.

P.S

the reason we called the NAS integration “Project-F” is simple, the original XtremIO product had a temporary name, “Project-X”🙂

2c1cda00-c03d-4e57-a661-9e9f9bbf5e2d-original

==Update==

you can now watch an high quality recording of the session itself here

 


03 Nov 16:17

Free eBook: Using SQL Server 2016 for Data Science & Advanced Analytics

by SQL Server Team

Reposted from the Cortana Intelligence & Machine Learning blog.

Data Science with Microsoft SQL Server 2016

The world around us – every business and nearly every industry – is being transformed by technology. SQL Server 2016 was built for this new world and to help businesses get ahead of today’s disruptions. With this free eBook, you will learn how to install, configure and use Microsoft’s SQL Server R Services in your data science and advanced analytics projects.

Read the Foreword to the eBook and download it from here.

03 Nov 16:15

Getting a history of database snapshot creation

by Paul Randal

Earlier today someone asked on the #sqlhelp Twitter alias if there is a history of database snapshot creation anywhere, apart from scouring the error logs.

There isn’t, unfortunately, but you can dig around the transaction log of the master database to find some information.

When a database snapshot is created, a bunch of entries are made in the system tables in master and they are all logged, under a transaction named DBMgr::CreateSnapshotDatabase. So that’s where we can begin looking.

Here’s a simple example of a database snapshot:

USE [master];
GO

IF DATABASEPROPERTYEX (N'Company_Snapshot', N'Version') > 0
BEGIN
    DROP DATABASE [Company_Snapshot];
END
GO
IF DATABASEPROPERTYEX (N'Company', N'Version') > 0
BEGIN
    ALTER DATABASE [Company] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [Company];
END
GO

-- Create a database
CREATE DATABASE [Company];
GO

-- Create the snapshot
CREATE DATABASE [Company_Snapshot]
ON (NAME = N'Company', FILENAME = N'C:\SQLskills\CompanyData.mdfss')
AS SNAPSHOT OF [Company];
GO

And I can find the transaction using the following code, plus who did it and when:

USE [master];
GO

SELECT
    [Transaction ID],
    SUSER_SNAME ([Transaction SID]) AS [User],
    [Begin Time]
FROM fn_dblog (NULL, NULL)
WHERE [Operation] = N'LOP_BEGIN_XACT'
    AND [Transaction Name] = N'DBMgr::CreateSnapshotDatabase';
GO
Transaction ID User             Begin Time
-------------- ---------------- ------------------------
0000:00099511  APPLECROSS\Paul  2016/10/20 13:07:53:143

Now to get some useful information, I can crack open one of the system table inserts, specifically the insert into one of the nonclustered indexes of the sys.sysdbreg table:

SELECT
    [RowLog Contents 0]
FROM fn_dblog (NULL, NULL)
WHERE [Transaction ID] = N'0000:00099511'
    AND [Operation] = N'LOP_INSERT_ROWS'
    AND [AllocUnitName] = N'sys.sysdbreg.nc1';
GO
RowLog Contents 0
-------------------------------------------------------------------------------------
0x26230000000100290043006F006D00700061006E0079005F0053006E0061007000730068006F007400

Bytes 2 through 5 (considering the first byte as byte 1) are the byte-reversed database ID of the snapshot database, and bytes 10 through the end of the data are the sysname name of the database. Similarly, grabbing the insert log record for the nonclustered index of the sys.syssingleobjrefs table allows us to get the source database ID.

Here’s the finished code:

SELECT * FROM
(
SELECT
    SUSER_SNAME ([Transaction SID]) AS [User],
    [Begin Time]
FROM fn_dblog (NULL, NULL)
WHERE [Transaction ID] = N'0000:00099511'
    AND [Operation] = N'LOP_BEGIN_XACT'
) AS [who],
(
SELECT
    CONVERT (INT,
        SUBSTRING ([RowLog Contents 0], 5, 1) +
        SUBSTRING ([RowLog Contents 0], 4, 1) +
        SUBSTRING ([RowLog Contents 0], 3, 1) +
        SUBSTRING ([RowLog Contents 0], 2, 1)) AS [Snapshot DB ID],
    CONVERT (SYSNAME, SUBSTRING ([RowLog Contents 0], 10, 256)) AS [Snapshot DB Name]
FROM fn_dblog (NULL, NULL)
WHERE [Transaction ID] = N'0000:00099511'
	AND [Operation] = N'LOP_INSERT_ROWS'
	AND [AllocUnitName] = N'sys.sysdbreg.nc1'
) AS [snap],
(
SELECT
    CONVERT (INT,
        SUBSTRING ([RowLog Contents 0], 5, 1) +
        SUBSTRING ([RowLog Contents 0], 4, 1) +
        SUBSTRING ([RowLog Contents 0], 3, 1) +
        SUBSTRING ([RowLog Contents 0], 2, 1)) AS [Source DB ID]
FROM fn_dblog (NULL, NULL)
WHERE [Transaction ID] = N'0000:00099511'
	AND [Operation] = N'LOP_INSERT_ROWS'
	AND [AllocUnitName] = N'sys.syssingleobjrefs.nc1'
) AS [src];
GO
User             Begin Time               Snapshot DB ID Snapshot DB Name  Source DB ID
---------------- ------------------------ -------------- ----------------- ------------
APPLECROSS\Paul  2016/10/20 13:07:53:143  35             Company_Snapshot  22

I’ll leave it as an exercise for the reader to wrap a cursor around the code to operate on all such transactions, and you can also look in the master log backups using the fn_dump_dblog function (see here for some examples).

Enjoy!

The post Getting a history of database snapshot creation appeared first on Paul S. Randal.

03 Nov 16:15

Post PASS Summit 2016 Resolutions - Followup

by drsql
Last year I started a new tradition for myself that will last until I specifically retire from the SQL Server community, something that I practically promise isn't coming in next year's resolutions. As this is the end of the PASS year, with the Summit...(read more)
03 Nov 16:14

The sniffing database

by Hugo Kornelis
Your SQL Server instances, like people with hay fever that forget to take their antihistamines during summer, is sniffing all the time. Sniffing is a trick employed by the optimizer in an attempt to give you better execution plans. The most common form of sniffing is parameter sniffing. Many people know about parameter sniffing, but there are a lot of misconceptions about this subject. I have heard people describe parameter sniffing as a bad thing, and I know people who claim that parameter sniffing...(read more)
03 Nov 16:14

Technical review–SysTools SQL Recovery

by Damian

SQL Server is a relational database that saves the entire data, which can be easily repaired by software when it is required. SQL MDF files are the main database files of SQL database. They are the base files of the primary data of SQL server and any damage to these files may result in the fall down of the complete database. Therefore any sort of corruption in these MDF files should be repaired and eliminate the risk of corruption of the entire database. The most common reason that is behind the corruption of MDF files is improper system shutdown. In addition, the SQL database may be infected with viruses and malwares that not only corrupt a single file but also have the capacity to destroy complete database. Now, the question comes how to retrieve the corrupted SQL database? There is one solution to overcome from such a situation is to utilize the third-party utility, i.e. SysTools SQL Recovery Software that repairs corrupt SQL database. This review discusses about the software  is based upon the testing performed by the application.

Introduction

This SQL file repair tool can repair both corrupted MDF as well as NDF files and export it to different file formats of a SQL Database. It repairs both primary and secondary database in, as it is form. It scans and retrieve numerous of NDF files or complete folder containing all NDF files. It supports SQL Server 2014 and all the below editions to recover database from mdf and ndf files.

Different Versions to Repair Corrupt SQL Database

The SQL Recovery software is available in two versions, i.e. Demo as well as Licensed. Users can select any version accordingly.

· Demo Version

The Demo version is totally free to utilize by simply downloading it from the official website of the SysTools Group. As it is a free edition, so there are limitations, i.e. it permits users to preview all the repaired components of SQL database, but cannot allow them to store and export them.

· Licensed Version

It is a paid edition of the software that can be bought from the official site of organization. It allows users to repair, save, and export all the data into SQL Server database.

Features of SQL .mdf File Repair Tool

Repair both MDF & NDF files

The tools help to retrieve both primary as well as secondary database files such as MDF and NDF files. It maintains the originality of file data after repairing the database table’s data and even previews the complete components in it.

Automatically Detects Server Edition

The application is programmed in a way that it automatically detects the edition of both NDF as well as MDF files. However, if the edition is known by the servers then, users can check it manually on their own.

Dual Scanning Way

The SQL .mdf file repair tool provides two modes for scanning the complete SQL server database file. One mode is Quick scan that is for normal database corruption. Another mode is advance scan, which is for the deep corruption of the database files (.mdf & .ndf).

Dual Scanning Way

Store Scanned Files For Further Usage

The SQL database repair tool provides the option to save the scanned data in .str file format. This feature enables the user to avoid re-scanning of the corrupted .mdf or .ndf file again and again.

Supports Both Database Keys

The SQL database repair tool supports both the primary as well as foreign keys along with database tables. Both the keys are supported after exporting the database that is performed via a software.

Supports Advance Data Types

The software supports the advance data types to repair SQL database that includes sql_variant, hierarchyid, geography, geometry data types, Datetime2, datetimeoffset, etc. Moreover, the application supports Unicode as well as ASCII XML data types.

Quickly Recover  SQL .mdf File Components

Once the database files scanned by SQL database recovery tool, It provides a preview of recovered items like Tables, Functions, Stored Procedures, Views, Triggers, Rules, Associated Primary Keys, Data Types, Unique Keys, and all the other SQL database components.

Quickly Recover  SQL .mdf File Components

 

Retrieve Deleted Table Data

The data of SQL table gets deleted accidentally due to various situations. In such a situation, the software will help to recover the deleted data of table in exact form without losing the data.

Repair SUSPECT SQL Server Database

If the user is unable to access the database because of SUSPECT error then Software will helps to repair database which is marked as suspected.

Export to SQL Server Database with Desired SQL Files

The utility gives an option to export the repaired data directly to live SQL Server database. The users only need the credentials such as Database name, Username, Server name and password etc.

After providing the above necessary information, user can export the desired database items from both MDF and NDF files and store it. It permits users to check or unchecked the recovered items accordingly and then exporting it to SQL Server database or as a SQL Server Compatible SQL Scripts.

Export to SQL Server Database with Desired SQL Files

 

Retains the Data Integrity

The tool repairs the complete data files of SQL Server in exact form. It preserves the on-disk folder structure, formatting, styling, etc. in the same form as it was.

  Conclusion

Considering the overall performance of the application I have to say I like it. It is also very intuitive and works very quickly. Although the software fails to repair the bulk, MDF files. However, it accurately repair highly corrupted data files (tested on 3 damaged databases). Moreover, it has easy and user-friendly interface, which is very necessary for users in repairing a corrupt SQL Server database. To sum it all up my recommendation for all the numerous of users is that this application is definitely worth trying.

03 Nov 16:14

What you should read :) during the upcoming !!! weekend - weekly SQL Server blogs review part 16

by Damian

I have found some great articles to be read

Hope you will have a lot of fun reading all of them.

 

Author

Title

Link

Adam Machanic

sp_whoisactive: The Big Fixes

http://bit.ly/2f1ddJA

Jorg Klein

Azure Data Lake Analytics U-SQL decryption possibilities

http://bit.ly/2dQ03MT

CSS SQL Server Engineers

How It Works: Online Index Rebuild – Can Cause Increased Fragmentation

 

http://bit.ly/2ehn3nL

CSS SQL Server Engineers

SQL 2016 — Why can’t I STRETCH my database (I have the right user name and password)?

http://bit.ly/2eCLqg4

Grant Fritchey

A Subquery Does Not Hurt Performance

http://bit.ly/2f1mZvc

 

Cheers

Damian

03 Nov 16:13

Top 5 Announcements at PASS Summit 2016

by SQL Server Team

This post is by Joseph Sirosh, Corporate Vice President of the Data Group at Microsoft.

In June, we’ve announced the general availability of SQL Server 2016, the world’s fastest and most price-performant intelligence database for HTAP (Hybrid Transactional and Analytical Processing) with updateable, in-memory columnstores and advanced analytics through deep integration with R Services. As we officially move into the data-driven intelligence era, we continue to bring new capabilities to more applications, environments and users than ever before.

Today, we’re making several announcements to bring even more value to our customers.

1. Public Preview of Azure Analysis Services

PASS_Announce_1

Azure Analysis Services (Azure AS) – Based on the proven analytics engine in SQL Server Analysis Services, Azure AS is an enterprise grade OLAP engine and BI modeling platform, offered as a fully managed platform-as-a-service (PaaS). Azure AS enables developers and BI professionals to create BI Semantic Models that can power highly interactive and rich analytical experiences in BI tools such as Power BI and Excel. Azure AS can consume data from a variety of sources residing in the cloud or on-premises (SQL Server, Azure SQL DB, Azure SQL DW, Oracle, Teradata to name a few) and surface the data in any major BI tool. You can provision an Azure AS instance in seconds, pause it, scale it up or down (planned during preview), based on your business needs. Azure AS enables business intelligence at the speed of thought! For more details, see the Azure blog.

2. SQL Server 2016 DW Fast Track Reference Architecture, 145TB

We have collaborated with a number of our hardware partners on a joint effort to deliver validated, pre-configured solutions that reduce the complexity and drive optimization when implementing a data warehouse based on SQL Server 2016 Enterprise Edition. Today, I am happy to announce the Data Warehouse Fast Track (DWFT) reference architectures that certify that a SQL Server 2016 SMP unit can support active data sets of up to 145TB and maximum user size of 1.2 Petabytes of SQL Data in an all flash system. These reference architectures provide tested and validated configurations and resources to help customers build the right environment for their data warehouse solutions. Following these best practices and guidelines will yield these tangible benefits:

  • Accelerated data warehouse projects with pre-tested hardware and SQL Server configurations.
  • Reduced hardware and maintenance costs by purchasing a balanced hardware solution and
  • optimizing it for a data warehouse workload.
  • Improved ROI by optimizing software assets.
  • Reduced planning and setup costs by leveraging certified reference architecture configurations.
  • Predictable performance by properly configuring and tuning the system.

PASS_Announce_2

3. Data Migration Assistant and Database Experimentation Assistant

Today, I am also happy to announce that we are releasing the Data Migration Assistant (DMA) v2.0. DMA delivers scenarios that reduce the effort to upgrade to latest SQL Server 2016 from legacy SQL Servers by detecting compatibility issues that can impact database functionality after an upgrade. It recommends performance and reliability improvements for your target environment and then migrates the entire SQL Server database. Furthermore, DMA provides seamless assessments and migrations to SQL Azure VM.  DMA assessments discover the breaking changes, behavioral changes and depreciated features that can affect your upgrades. DMA also discovers the new features in the target SQL Server platform that your applications can benefit from after an upgrade.  DMA is the only tool providing comprehensive data platform movement capabilities, assisting DBAs with more than just schema and data migrations. DMA V1.0 was released on August 26, 2016 for general availability. Since then, DMA has been downloaded more than 2,000 times world-wide assessing more than 25,000 (37K Cores) databases with over 1,000 unique users.

Another tool that we are bringing to the market today is Database Experimentation Assistant (DEA). It’s a new A/B testing solution for SQL Server upgrades. It enables customers to conduct experiments on database workloads across two versions of SQL Server. Customers who are upgrading from older SQL Server versions (starting 2005 and above) to any new version of the SQL Server will be able to use key performance insights, captured using a real world workload to help build confidence about upgrade among database administrators, IT management and application owners by minimizing upgrade risks. This enables truly risk-free migrations.

4. Azure SQL Data Warehouse Expanded Free Trial

I am particularly excited about announcing the exclusive Azure SQL Data Warehouse free trial. Starting today, customers can request a one-month free trial for Azure SQL Data Warehouse. You can bring your data in and try out the capabilities of SQL Data Warehouse and complete POCs. This is a limited time offer, so submit your request now here. For PASS attendees (please watch my keynote) we have a special referral code that you can use while requesting the free trial. The SQL Data Warehouse team at PASS can also help you set up your free trial while you are there, you can find them at the Microsoft booth.

PASS_Announce_3

5. Cognitive Toolkit (formerly known as CNTK)

PASS_Announce_5

Starting today, we are announcing the availability of a beta for Microsoft Cognitive Toolkit (formerly CNTK), a free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain. The Cognitive Toolkit enables developers and data scientists to reliably train faster than other available toolkits on massive datasets across several processors, including CPUs, GPUs and FPGAs, as well as multiple machines. Upgrades include more programming flexibility, advanced learning methods like reinforcement learning and extended API support for training and inference from Python, C++ and BrainScript so developers can use popular languages and network. Cognitive Toolkit is available under an open-source license to the public, and it is one of the most popular deep learning projects on GitHub. It is used to develop commercial grade AI in popular Microsoft products like Skype, Cortana, Xbox and Bing. Developers and researchers can start training with the Microsoft Cognitive Toolkit for free by visiting https://aka.ms/cognitivetoolkit.

Learn more about these announcements from my keynote tomorrow morning at PASS Summit 2016 at Washington Convention Center or via live-stream.

PASS_Announce_4

@josephsirosh

03 Nov 16:13

Migrating Databases to Azure SQL Database

by Tim Radney

As time passes, more companies are migrating to, or at least evaluating, Azure SQL Database as an alternative to the high cost of running SQL Server on premises.

Checking Compatibility

Azure SQL DatabaseOne of the first aspects of moving your database to Azure SQL Database is to check for compatibility and Microsoft gives you numerous ways to do this for Azure SQL Database V12 (hereafter referred to as just ‘V12’). One of these methods is using SQL Server Data Tools for Visual Studio (SSDT) which uses the most recent compatibility rules to detect V12 incompatibilities. In SSDT you can import your database schema and build a project for a V12 deployment, and if any incompatibilities are found, they can be corrected within SSDT and then synchronized back to the source database.

You can also use a command-line tool called SqlPackage that can generate a report of compatibility issues (and always make sure you have the latest version). SQL Server Management Studio is another way of doing it, using the Export Data-tier Application feature, which can detect and report errors to the screen. If no errors are detected, you can then migrate your database to V12. If incompatibilities are detected, they can be corrected using SSMS prior to migration.

Data Migration Assistant is a stand-alone tool (easily confused with the SQL Server Migration Assistant) that can be used to help reduce the effort to upgrade, and replaces the SQL Server 2016 Upgrade Advisor Preview. DMA can also recommend performance and reliability improvements. Another tool, SQL Azure Migration Wizard (SAMW), is a Codeplex tool that uses Azure SQL Database V11 compatibility rules to detect V12 incompatibilities. SAMW can also complete the migration to V12 and fix some compatibility issues. Something to be aware of when using SAMW is that it can detect incompatibilities that don’t need to be fixed.

Migrating Data

Once your database has passed the compatibility check, you have to determine the best method to migrate. Some of the more common methods include using the SSMS Migration Wizard, exporting to a BACPAC, using transactional replication, or manually scripting the databases and inserting your data.

Using the SSMS Migration Wizard is great for SQL Server 2005 and up databases that are small to medium size. You can activate this wizard in SSMS 2016, by right clicking on a database, choose Tasks, and then Deploy Database to Microsoft Azure SQL Database. In SSMS 2014 it’s called Deploy Database to Windows Azure SQL Database. Using this wizard allows you to specify the database to migrate, connect to your Azure subscription, chose the location for the .bacpac file, the new database name, and which tier to migrate to. When you click finish, the wizard will extract and validate the schema and then export the data. Once the data is exported, it will create a deployment plan and begin importing the data into the new V12 database.

DBProjectVery similar to the SSMS Migration Wizard is the Export Data-tier Application. To select this option, right click on the database, chose Tasks, and then Export Data-tier Application. In the export settings you specify where you would like to create the .bacpac file. You can save this locally, or save it directly to your Azure storage account. There is also an advanced tab where you can select which tables to include. This is helpful if your local database contains copies of tables that you don’t wish to migrate to V12. When you chose Finish, it will export your data. You can then connect to your Azure server through SSMS and chose to Import Data-tier Application. You will specify the location of the file, the database name and tier of the Azure SQL Database. When you chose finish, the database will begin importing. This method gives you slightly more control over the process since it separates the export from the import. It also gives you the option of storing the .bacpac file in your Azure storage account so that the more vulnerable import process won’t be dependent on your internet connection.

An option when using either the SSMS Migration Wizard or the Export Data-tier Application, is to create an Azure VM with SQL Server and set up log shipping. This will pre-stage your database in the Azure cloud to help minimize the import time of the database. When it comes time to do the migration, you just restore the final log backups on the secondary and then remove log shipping. To bring the database online, perform a restore with recovery. This will essentially eliminate the time it takes to copy your database from your data center into the Azure data center.

Transactional replication is another method to help reduce the downtime of migrating to V12. It’s the best option if you have an extremely small maintenance window to switch over to V12, if the database can support transactional replication. Setting up transactional replication requires primary keys for each published table, which can be problematic for a lot of databases. Configuring transactional replication can also be complicated, as you have to set up the distribution database, set up the publisher and subscriber, and monitor jobs.

Generate ScriptsYou can also manually migrate by using the Generate Scripts wizard and scripting out the database schema and/or data. You can then create an empty database in Azure and import your schema and or data, by executing the script. I have heard of some people using this method to create the empty database and then manually inserting their data one table at a time using SSMS and a linked server. This method may work for you, but it can also be very complicated if you have a lot of schema constructs like foreign key relationships and identity columns, in which case the other methods above would be more reliable and efficient.

Other Migration Considerations

When planning to migrate on premises databases to V12, the size of the database is a huge factor in how long the migration will take. The export of the database, the transfer of the data, and the import will all increase in proportion to the size of the database.

Another big factor in the restore/import time when moving your databases to V12 is the performance tier you are restoring too. The restore/import process requires a lot of horsepower, so to help expedite your migration, you should consider restoring to a higher performance tier. When the database is online, you can easily and quickly drop down to a lesser tier that meets your daily performance needs. Being able to change performance tiers with a few mouse clicks is one of the big benefits of Azure SQL Database.

Monitoring

An important part of administering any database is monitoring. If you are not monitoring something, you cannot measure it. If you don’t know what your metrics are when things are working normally, how will you know which things are worse when performance is degraded? With on premises databases, we have tools that we are familiar with: SQL Server Management Studio, Performance Monitor, Task Manager, DMVs, and so on. When we move our databases to V12, we lose the ability to monitor from an OS perspective, and other concepts change a bit too. We now have this metric called DTU to work with, which stands for Database Transaction Units. Think of it as a combination of CPU, data and transaction log I/O, and memory. The Azure Portal includes a monitoring chart that defaults to having DTU percentage checked, and you can edit this chart to include additional metrics, such as:

  • Blocked by Firewall
  • CPU %
  • DTU limit
  • DTU used
  • Data I/O %
  • Database size %
  • Deadlocks
  • Failed Connections
  • In-Memory OLTP storage %
    (preview)
  • Log I/O %
  • Sessions %
  • Successful Connections
  • Total database size
  • Workers %

At a minimum, I would add CPU percentage, data I/O percentage, deadlocks, and database size percentage. Currently, this chart displays the previous hour of resource utilization.

Monitoring by DMVs has not changed much, other than having a few new DMVs related for individual database metrics and how to calculate database size. One of my previous articles here, Getting Started Tuning Performance in Azure SQL Database, covers some of the differences in the common scripts that are used for gathering disk latencies and wait stats in relation to Azure SQL Database.

Third-party tools have also started including hooks into Azure SQL Database, such as SentryOne with DB Sentry. DB Sentry gives you the ability to monitor performance and manage events that are occurring in your system. One cool feature is the Top SQL function that allows you to see the queries that have the most impact on your overall performance so you can tune/fix any issues with that query. Another is charting DTU over time and visualizing that on a dashboard alongside other important metrics.

Top SQL in the SentryOne client
Top SQL in the SentryOne client
DTU Usage on the DB Sentry Dashboard
DTU Usage on the DB Sentry Dashboard

These metrics are stored in a dedicated database, providing you the ability to baseline and to trend over performance over time, which is much better than the limited charts you currently get in the Azure Portal.

Summary

There are many benefits to migrating to Azure SQL Database V12, if your database is compatible, so take advantage of one of the available tools to check your compatibility with V12. If your database is compatible, or can be easily modified to become compatible, then you can easily migrate to Azure SQL Database V12 and begin testing and benchmarking your applications and workloads.

The post Migrating Databases to Azure SQL Database appeared first on SQLPerformance.com.

03 Nov 16:13

Azure Analysis Services

by Prologika - Teo Lachev

Despite the mantra you might hear elsewhere, my experience shows that the best self-service BI is empowering users to create reports from trusted semantic models sanctioned and owned by IT. Most of the implementation work I do involves Analysis Services in one form or another. Analysis Services has a very important role in your BI ecosystem as I explain in the “Why Semantic Layer” newsletter.

Today, at the SQL PASS SUMMIT, Microsoft announced that Analysis Services Tabular is now available as an Azure PaaS service. As a participant in the prerelease program, I had the opportunity to test Azure Analysis Services and this is why I believe you should care:

  1. If you develop cloud-based solutions, you might not have to provision a VM for Tabular anymore. Instead, you can provision an Analysis Services cloud service in seconds, just like you can provision an Azure SQL Database.
  2. You can easily scale up or down Azure Analysis Services, just like you can do this with Azure SQL Database. You can even pause it so that you don’t incur cost.
  3. You don’t have to set up a gateway for SSAS. You can use Power BI Desktop to connect to Azure Analysis Services and deploy the report to Power BI. However, you would need a gateway if your source data resides on premises so that you can process the model with on-premises data. Note that currently you can’t use Power BI Get Data to connect to Azure Analysis Services directly from Power BI. Instead, you must use Power BI Desktop.
  4. The service is highly available by default. SQL Server pros implementing highly available solutions know that this is not easy and not cheap. So, factor in high availability if you find Azure Analysis Services pricing is too high.

On the downside, as it stands Azure Analysis Services uses Azure Active Directory for security and it doesn’t support claim authentication. Power BI users will be able to authenticate but not Power BI Embedded (not yet).

Currently in preview, Azure Analysis Services is a very important addition to the Microsoft Azure BI stack that allows BI pros to implement cloud-based semantic models as they can currently do on premises.

102616_0114_AzureAnalys1.png

03 Nov 16:13

ODBC Driver 13.0 for SQL Server – Linux is now released

by SQL Server Team

This post is authored by Meet Bhagdev.

We are delighted to share the full release of the Microsoft ODBC Driver 13 for Linux –  (Ubuntu, RedHat and SUSE). The new driver enables access to SQL Server, Azure SQL Database and Azure SQL DW from any C/C++ application on Linux.

What’s new

  • Native Linux Install Experience: The driver can now be installed with apt-get (Ubuntu), yum (RedHat/CentOS) and Zypper (SUSE). Instructions on how to do this is below.
  • AlwaysOn Availability Groups (AG): The driver now supports transparent connections to AlwaysOn Availability Groups. The driver quickly discovers the current AlwaysOn topology of your server infrastructure and connects to the current active server transparently.
  • TLS 1.2 support: The driver now supports TLS 1.2 connections to SQL Server.

Install the ODBC Driver for Linux on Ubuntu 15.04

1. sudo su
2. sh -c 'echo "deb [arch=amd64] https://apt-mo.trafficmanager.net/repos/mssql-ubuntu-vivid-release/ vivid main" > /etc/apt/sources.list.d/mssqlpreview.list'
3. sudo apt-key adv --keyserver apt-mo.trafficmanager.net --recv-keys 417A0893
4. apt-get update
5. apt-get install msodbcsql
6. apt-get install unixodbc-dev-utf16 #this step is optional but recommended*

Install the ODBC Driver for Linux on Ubuntu 15.10

1. sudo su
2. sh -c 'echo "deb [arch=amd64] https://apt-mo.trafficmanager.net/repos/mssql-ubuntu-wily-release/ wily main" > /etc/apt/sources.list.d/mssqlpreview.list'
3. sudo apt-key adv --keyserver apt-mo.trafficmanager.net --recv-keys 417A0893
4. apt-get update
5. apt-get install msodbcsql
6. apt-get install unixodbc-dev-utf16 #this step is optional but recommended*

Install the ODBC Driver for Linux on Ubuntu 16.04

1. sudo su
2. sh -c 'echo "deb [arch=amd64] https://apt-mo.trafficmanager.net/repos/mssql-ubuntu-xenial-release/ xenial main" > /etc/apt/sources.list.d/mssqlpreview.list'
3. sudo apt-key adv --keyserver apt-mo.trafficmanager.net --recv-keys 417A0893
4. apt-get update
5. apt-get install msodbcsql
6. apt-get install unixodbc-dev-utf16 #this step is optional but recommended*

Install the ODBC Driver for Linux on RedHat 6

1. sudo su
2. yum-config-manager --add-repo https://apt-mo.trafficmanager.net/yumrepos/mssql-rhel6-release/
3. yum-config-manager --enable mssql-rhel6-release
4. wget "http://aka.ms/msodbcrhelpublickey/dpgswdist.v1.asc"
5. rpm --import dpgswdist.v1.asc
6. yum remove unixODBC #to avoid conflicts during installation
7. yum update
8. yum install msodbcsql
9. yum install unixODBC-utf16-devel #this step is optional but recommended*

Install the ODBC Driver for Linux on RedHat 7

1. sudo su
2. yum-config-manager --add-repo https://apt-mo.trafficmanager.net/yumrepos/mssql-rhel7-release/
3. yum-config-manager --enable mssql-rhel7-release
4. wget "http://aka.ms/msodbcrhelpublickey/dpgswdist.v1.asc"
5. rpm --import dpgswdist.v1.asc
6. yum remove unixODBC #to avoid conflicts during installation
7. yum update
8. yum install msodbcsql
9. yum install unixODBC-utf16-devel #this step is optional but recommended*

Install the ODBC Driver for SUSE12

1. zypper ar https://apt-mo.trafficmanager.net/yumrepos/mssql-suse12-release/ "mssql" #To add the repo
2. wget "http://aka.ms/msodbcrhelpublickey/dpgswdist.v1.asc"
3. rpm --import dpgswdist.v1.asc
4. zypper remove unixODBC #to avoid conflicts
5. zypper update
6. zypper install msodbcsql
7. zypper install unixODBC-utf16-devel #this step is optional but recommended*

Note: Packages for SQL Server command line tools will be available soon. The above mentioned packages only install the ODBC Driver for SQL Server that enable connectivity from any C/C++ application.

Try our Sample

Once you install the driver that runs on a supported Linux distro, you can use this C sample to connect to SQL Server/Azure SQL DB/Azure SQL DW. To download the sample and get started, follow these steps:

If you installed the driver using the manual instructions, you will have to manually uninstall the ODBC Driver and the unixODBC Driver Manager to use the deb/rpm packages. If you have any questions on how to manually uninstall, feel free to leave a comment below.

Please fill bugs/questions/issues on our issues page. We welcome contributions/questions/issues of any kind. Happy programming!

Survey and Future Plans

Please take this survey to help prioritize features and scenarios for the next release of the ODBC Driver for Linux. Going forward we plan to expand SQL Server 16 Feature Support (example: Always Encrypted), improve test coverage, and fix bugs reported on our issues page.

Please stay tuned for upcoming releases that will have additional feature support and bug fixes. This applies to our wide range of client drivers including PHP 7.0, Node.js, JDBC and ADO.NET which are already available.

03 Nov 16:13

Data Warehouse Model: Diffuse Relationships

by Michael Blaha

Click here to learn more about author Michael Blaha. A diffuse relationship is one of a group of similar relationships, which broadly apply to entities in a model. We can show them explicitly, but such an approach can become verbose and obscure the deeper content of a model. We coined the term “diffuse relationship” as […]

The post Data Warehouse Model: Diffuse Relationships appeared first on DATAVERSITY.

03 Nov 16:12

PASS Summit 2016 – Blogging again – Keynote 1

by Rob Farley

.So I’m back at the PASS Summit, and the keynote’s on! We’re all getting ready for a bunch of announcements about what’s coming in the world of the Microsoft Data Platform.

First up – Adam Jorgensen. Some useful stats about PASS, and this year’s PASSion Award winner, Mala Mahadevan (@sqlmal)

There are tweets going on using #sqlpass and #sqlsummit – you can get a lot of information from there.

Joseph Sirosh – Corporate Vice President for the Data Group, Microsoft – is on stage now. He’s talking about the 400M children in India (that’s more than all the people in the United States, Mexico, and Canada combined), and the opportunities because of student drop-out. Andhra Pradesh is predicting student drop-out using new ACID – Algorithms, Cloud, IoT, Data. I say “new” because ACID is an acronym database professionals know well.

He’s moving on to talk about three patterns: Intelligence DB, Intelligent Lake, Deep Intelligence.

Intelligence DB – taking the intelligence out of the application and moving it into the database. Instead of the application controlling the ‘smarts’, putting them into the database provides models, security, and a number of other useful benefits, letting any application on top of it. It can use SQL Server, particularly with SQL Server R Services, and support applications whether in the cloud, on-prem, or hybrid.

Rohan Kumar – General Manager of Database Scripts – is up now. Fully Managed HTAP in Azure SQL DB hits General Availability on Nov 15th. HTAP is Hybrid Transactional / Analytical Processing, which fits really nicely with my session on Friday afternoon. He’s doing a demo showing the predictions per second (using SQL Server R Services), and how it easily reaches 1,000,000 per second. You can see more of this at this post, which is really neat.

Justin Silver, a Data Scientist from PROS comes onto stage to show how a customer of theirs handles 100 million price requests every day, responding to each one in under 200 milliseconds. Again we hear about SQL Server R Services, which pushes home the impact of this feature in SQL 2016. Justin explains that using R inside SQL Server 2016, they can achieve 100x better performance. It’s very cool stuff.

Rohan’s back, showing a Polybase demo against MongoDB. I’m sitting next to Kendra Little (@kendra_little) who is pretty sure it’s the first MongoDB demo at PASS, and moving on to show SQL on Linux. He not only installed SQL on Linux, but then restored a database from a backup that was taken on a Windows box, connected to it from SSMS, and ran queries. Good stuff.

Back to Joseph, who introduces Kalle Hiitola from Next Games – a Finnish gaming company – who created a iOS game that runs on Azure Media Services and DocumentDB, using BizSpark. 15 million installs, with 120GB of new data every day. 11,500 DocumentDB requests per second, and 43 million “Walkers” (zombies in their ‘Walking Dead’ game) eliminated every day. 1.9 million matches (I don’t think it’s about zombie dating though) per day. Nice numbers.

Now onto Intelligent Lake. Larger volumes of data than every before takes a different kind of strategy.

Scott Smith – VP of Product Development from Integral Analytics – comes in to show how Azure SQL Data Warehouse has allowed them to scale like never before in the electric-energy industry. He’s got some great visuals.

Julie Koesmarno on stage now. Can’t help but love Julie – she’s come a long way in the short time since leaving LobsterPot Solutions. She’s done Sentiment Analysis on War & Peace. It’s good stuff, and Julie’s demo is very popular.

Deep Intelligence is using Neural Networks to recognise components in images. eSmart Systems have a drone-based system for looking for faults in power lines. It’s got a familiar feel to it, based on discussions we’ve been having with some customers (but not with power lines).

Using R Services with ML algorithms, there’s some great options available…

Jen Stirrup on now. She’s talking about Pokemon Go and Azure ML. I don’t understand the Pokemon stuff, but the Machine Learning stuff makes a lot of sense. Why not use ML to find out where to find Pokemon?

There’s an amazing video about using Cognitive Services to help a blind man interpret his surroundings. For me, this is the best demo of the morning, because it’s where this stuff can be really useful.

SQL is changing the world.

@rob_farley

03 Nov 16:12

How bwin is using SQL Server 2016 In-Memory OLTP to achieve unprecedented performance and scale

by Mike Weiner - SQLCAT

Written by: Biljana Lazic (bwin – Senior DBA) and Rick Kutschera (bwin – Engineering Manager). Reviewed by: Mike Weiner (SQLCAT)

bwin (part of GVC Holdings PLC) is one of Europe’s leading online betting brands, and is synonymous with sports. Having offices situated in various locations across Europe, India, and the US, bwin is a leader in several markets including Germany, Belgium, France, Italy and Spain.

To be able to achieve our goals in these increasingly competitive markets, bwin’s infrastructure is constantly being pushed to stay on top of today’s – and sometimes even tomorrow’s – technology demands. With around 19 million bets and over 250 000 active users per day, our performance and scale requirements are extraordinary. In this blog, we will discuss how we have adopted In-Memory OLTP with SQL Server 2016 to meet these demands.

Our Caching Systems:

For years we’ve depended on Microsoft AppFabric and other distributed cache systems such as Cassandra or Memcached, as one of the central pieces in our architecture, in order to meet the demanding requirements on our systems. All major components, including sports betting, poker, and casino gaming rely on this cache, which makes it a critical component for our current and future business needs. In fact, a failure of this caching system would directly translate to a total blackout of our business, making it one of the most mission critical systems in the overall architecture.

With this configuration, we faced scalability issues, and worse, we saw that with higher transaction volumes, the stability of the whole distributed cache system was not able to keep up with the workload. Even in scaling out the amount of caching nodes we still faced stability issues, leading to high availability degradation.

Additionally, there were setup and maintenance pains, which lead to a high workload overhead for various departments to keep the system operational. Even with all this reoccurring work, the results we achieved were never satisfactory.

Below is a simplified architecture of our distributed caching system:

appfabric_System

For these reasons, to become more stable and keep up with the business requirements, we were forced to consider better alternatives and overall solutions for our caching layer.

That was the moment when we realized that we already had the solution – codenamed “Hekaton”, the In-Memory OLTP engine, built within SQL Server. In fact, this was the technology we were already using for a similar solution for a while, our ASP.NET SessionState.

Our History with ASP.Net SessionState:

ASP.NET SessionState itself is a caching system that we have been using for a long time now and thanks to the performance we achieved, it became a building block for our new global caching database. But before we get further into the global caching database, let’s take a look into our history with ASP.NET SessionState.

bwin was the very first customer to use the In-Memory OLTP engine for an ASP.Net SessionState database in production, with SQL Server 2014 (long before it was called SQL Server 2014). For reference, before SQL Server 2014, our ASP.NET SessionState database was able to handle around 12,000 batch requests/sec before suffering from latch contention. We were forced to implement 18 SessionState databases and partition our workload to handle the required user load. With SQL Server 2014 In-Memory OLTP we could consolidate back to one SQL Server box and database to handle ~300,000 batch requests/sec, with the bottleneck moving to the splitting-up of data as there was no support of LOB datatypes with memory-optimized tables.

Over the years, we gathered a good deal of experience working with In-Memory OLTP for our ASP.NET SessionState solution, and the performance of the solution received positive feedback both from within and outside of our company.

With In-Memory OLTP in SQL Server 2016 extending its functionality, for example memory-optimized table support for LOB datatypes, we were able to move more code into native compiled stored procedures, and even reach a new record for our In-Memory OLTP based ASP.NET SessionState database during our research LAB engagement with Microsoft Enterprise Engineering Center (EEC) in October 2015.

With all the scalability improvements in SQL Server 2016, and migration of the all the Transact-SQL to natively compiled stored procedures and a memory-optimized table, we were able to achieve and sustain over 1.2 million batch requests/second – an improvement of 4x over our previous high watermark!

Below is the summary of performance improvements we’ve achieved by using In-Memory OLTP for our ASP.NET SessionState. We have also included the batch requests/sec from performance monitor and the measurements of waits within SQL Server

Version SessionState Performance Technology Bottleneck
SQL Server 2012 12 000 batch requests/sec SQL Server Interpreted T-SQL Latch contentions
SQL Server 2014 300 000 batch requests/sec Memory-Optimized Table, Interpreted T-SQL, Handling of Split LOBs CPU
SQL Server 2016 1 200 000 batch requests/sec Memory-Optimized Table with LOB support, Natively Compiled stored procedures CPU

clip_image001

**NOTE: In Testing we did hit spinlock issues at around 800,000 batch requests/sec, this is already resolved within SQL Server 2016 CU2 (Blog: https://blogs.msdn.microsoft.com/sqlcat/2016/09/29/sqlsweet16-episode-8-how-sql-server-2016-cumulative-update-2-cu2-can-improve-performance-of-highly-concurrent-workloads/)

With this level of performance and stability achieved, together with all the experience we gathered during years of working with the ASP.NET SessionState solution, we felt confident we could utilize this as our building block for a new global caching system based on the In-Memory OLTP engine.

Our New Global Caching System

The global caching system, based on our ASP.NET Session State implementation and In-Memory OLTP is now the replacement for all our distributed caching systems.

Below is a simplified architecture diagram of our new global caching system with SQL Server 2016 In-Memory OLTP:

clip_image005[4]

Having changed the architecture, we can now not just keep up with performance we had with our distributed cache system, but far-exceed it, using only one single database node, compared to the 19 mid-tier cache nodes previously used.

Version Cache System Performance Hardware nodes/distribution
Mid-Tier Cache solution without SQL Server 150 000 batch requests/sec 19
Solution using SQL Server 2016 1 200 000 batch requests/sec 1

In addition to reducing the number of servers needed to obtain the performance we needed, we also achieved several performance gains, as displayed in the diagram below. The graphic, which comes from our application monitoring solution, shows three things:

Usage: First, the green circular rings represent all the different products (and corresponding server counts which are represented by the numbers in the circles) dependent on the global cache. This includes numerous products from sports betting, casino gaming, bingo, all the way to our portal and sales API, just to name a few. With all these components accessing the cache, it has become even more central in our architecture than the ASP.NET SessionState.

Performance Throughput: Next, you can see our average load, that’s around 1.6 million requests per minute. With our user load scaling up, we expect to have twice the number of batch requests/sec and we expect to be able to handle even up to 20 times this load.

Performance Latency: Finally, you can see the consistent latency measured from the client around round-trip time at 1ms. This number is even more important if we directly compare it to the previous distributed cache system, where the latency varied all the time, having response times from 2ms up to 200ms.

clip_image007[4]

Also of note, as the global cache is such a central piece of our business we also have it as part of a high availability solution. In this case, if for some reason the server hosting the SQL Server database does fail we can easily move the workload to another SQL Server database.

Implementation of the Global Caching Database

While the performance gains with In-Memory OLTP have been amazing, the implementation at its core was quite simple. From the database perspective, the memory-optimized table, simply put, is just a key/value store with its primary makeup containing a key (Primary Key), a value (BLOB) and an expiration date. The application has three possible ways of interacting with this table, in the form of three natively compiled stored procedures:

  • One stored procedure to insert a value with a key into the table
  • One stored procedure to retrieve the value, by providing the key
  • One stored production to delete the value from the table.

In the background, there is a scheduled T-SQL job which deletes all expired entries from the table, on a regular basis.

From the development perspective, the impact was quite minimal. Previously, all the code necessary to access the distributed cache solutions was contained in an “abstraction layer” DLL. Hence the changes needed to use SQL Server as the caching solution were quite localized to that specific DLL, which meant no impact to the actual applications using the caching tier.

As our senior software engineer on the project noted All in all the migration of the code in [our] framework was not so difficult, we spent much more time by testing and tuning it than by the implementation itself.” – Marius Zuzcak

In the appendix below we provide code from the database, of the memory-optimized table using the LOB datatype, as well as code from the data access layer with calls to the In-Memory OLTP Global Cache.

Appendix: Code Examples

Below is the Transact-SQL code for the memory-optimized table, now with LOB datatype support:

CREATE TABLE [dbo].[CacheItems]
(      
[Key] [nvarchar](256) COLLATE Latin1_General_100_BIN2 NOT NULL,
[Value] [varbinary](max) NOT NULL,
[Expiration] [datetime2](2) NOT NULL,
[IsSlidingExpiration] [bit] NOT NULL,
[SlidingIntervalInSeconds] [int] NULL,
CONSTRAINT [pk_CacheItems] PRIMARY KEY NONCLUSTERED HASH
( [Key]) WITH ( BUCKET_COUNT = 10000000)
)WITH ( MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_ONLY )

 Here we are providing pseudo-code from the data-access layer code (Hekaton.Dal).

using System;
using System.Data.SqlClient;
using System.Diagnostics.Contracts;

namespace Hekaton.Dal
{
   internal interface IHekatonDal
   {
       object GetCacheItem(string key);
       void SetCacheItem(string key, object value, CacheItemExpiration cacheItemExpiration);
       object RemoveCacheItem(string key);
   }
   internal sealed class HekatonDal : IHekatonDal
   {
       private readonly IHekatonConfiguration configuration;
       public HekatonDal(IHekatonConfiguration configuration)
       {
           Contract.Requires(configuration != null);
           this.configuration = configuration;
       }
       public object GetCacheItem(string key)
       {
           byte[] buffer;
           long length;
           ExecuteReadOperation(key, out buffer, out length, SqlCommands.GetCacheItemCommand);
           return Serialization.DeserializeBuffer(buffer, length);
       }
       public void SetCacheItem(string key, object value, CacheItemExpiration cacheItemExpiration)
       {
           ExecuteSetCacheItem(key, value, cacheItemExpiration);
       }
       public object RemoveCacheItem(string key)
       {
           byte[] buffer;
           long length;
           ExecuteReadOperation(key, out buffer, out length, SqlCommands.RemoveCacheItemCommand);
           return Serialization.DeserializeBuffer(buffer, length);
       }
       private void ExecuteReadOperation(string key, out byte[] valueBuffer, out long valueLength,
           Func<SqlConnection, SqlCommand> readCommand)
       {
           using (var connection = new SqlConnection(configuration.HekatonConnectionString))
           using (var command = readCommand(connection))
           {
               command.Parameters[0].Value = key;
               connection.Open();
               using (command.ExecuteReader())
               {
                   // Item is returned from the output parameter.
                  var value = command.Parameters[1].Value;
                   if (Convert.IsDBNull(value))
                   {
                       valueBuffer = null;
                       valueLength = 0;
                   }
                   else
                   {
                       valueBuffer = (byte[])value;
                       valueLength = valueBuffer.LongLength;
                   }
               }
           }
       }
       private void ExecuteSetCacheItem(string key, object value, CacheItemExpiration cacheItemExpiration)
       {
           using (var connection = new SqlConnection(configuration.HekatonConnectionString))
           using (var command = SqlCommands.SetCacheItemCommand(connection))
           {
                var serializedValue = Serialization.SerializeObject(value);
                command.Parameters[0].Value = key;
                command.Parameters[1].Value = serializedValue;
                command.Parameters[2].Value = cacheItemExpiration.AbsoluteExpiration;
                command.Parameters[3].Value = cacheItemExpiration.IsSlidingExpiration;
                command.Parameters[4].Value = cacheItemExpiration.SlidingIntervalInSeconds;
                connection.Open();
                command.ExecuteNonQuery();
           }
       }
   }
}

Below are the SQL Server calls to get, set, and remove items from the memory-optimized table, these calls can be utilized from the data access layer

//GetCacheItem
public override CacheItem GetCacheItem(string key, string regionName = null)
       {
           var compositeKey = BuildCompositeCacheKey(key);
           try
           {
               var result = hekatonDal.GetCacheItem(compositeKey);
               return result != null ? new CacheItem(key, result, regionName) : null;
           }
           catch (SqlException ex)
           {
               Log(CreateLogMessage(key, compositeKey, ex, "Failed to get"), ex);
               return null;
           }
           catch (SerializationException ex)
           {
               log.Error(CreateLogMessage(key, compositeKey, ex, "Failed to get"));
               return null;
           }
       }
//SetCacheItem
   public override void SetCacheItem(string key, object value, CacheItemPolicy policy, string regionName = null)
       {          
           var compositeKey = BuildCompositeCacheKey(key);
           try
           {
               hekatonDal.SetCacheItem(compositeKey, value, new CacheItemExpiration(policy, configuration));
           }
           catch (SqlException ex)
           {
               Log(CreateLogMessage(key, compositeKey, ex, "Failed to set"), ex);
           }
       }
//RemoveCacheItem
public override object RemoveCacheItem(string key, string regionName = null)
       {
           var compositeKey = BuildCompositeCacheKey(key);
           try
           {
               var result = hekatonDal.RemoveCacheItem(compositeKey);
               return result;
           }
           catch (SqlException ex)
           {
               Log(CreateLogMessage(key, compositeKey, ex, "Failed to remove"), ex);
               return null;
           }
           catch (SerializationException ex)
           {
               log.Error(CreateLogMessage(key, compositeKey, ex, "Failed to deserialize (while removing)"));
               return null;
           }
       }
03 Nov 16:11

The Future of Microsoft Logical Data Warehouse

by Prologika - Teo Lachev

Let’s face it, the larger the company, the more difficult is to achieve the dream of single enterprise data warehouse (EDW). In a typical mid-size to large organization, data is found in many data repositories and integrating all this data is difficult. I’m doing an assessment and strategy engagement now for a unit in a large organization, and they need access to at least 10 other on-premises systems, including two very large repositories. Naturally, they don’t want to import all of this data, which could be millions of rows per day, and recreate their own copy of these large corporate repositories. So what to do?

In my “QUO VADIS DATA WAREHOUSE?” newsletter, I defined a logical data warehouse (LDW), also known as data virtualization, is an emerging technology that allows you to access the data where it is. Don’t we have linked servers in SQL Server that do this? We do and they might work to a point. But what if you want scale out distributed queries to achieve better performance? In today’s SQL PASS SUMMIT keynote, Day 1 Keynote “A.C.I.D. Intelligence” (A.C.I.D stands for Algorithms, Cloud, IoT, Data), Rohan Kumar showed something that I think it’s very important and it deserves much more attention than occasional references in blogs. It showed where Microsoft is bringing PolyBase and how this technology could be the Microsoft implementation for data virtualization.

In SQL Server 2016, PolyBase allows you to access data in on-premises Hadoop cluster and in Azure Blob Storage. For example, you can store some files in HDFS and define an external PolyBase table. Then, you can have a query with heterogeneous join between a local SQL Server table and the external table. Rohan showed that Microsoft will extend PolyBase to other popular SQL and NoSQL databases. More importantly, it showed that just like an MPP appliance, such as Microsoft Analytics Platform System (APS) or Azure SQL Data Warehouse, a SQL Server node would allow you to combine multiple SQL Server instances as compute notes so that you scale out access to these data sources. For example, if you have two SQL Server compute nodes and you use PolyBase to access an Oracle database, you’ll be essentially spreading the query across these nodes in parallel and then combine the results. Of course, just like linked servers, there are technical challenges, such as cases where SQL Server might need to move data to the other node. Rohan mentioned that the SQL Server query optimizer will have smarts to optimize heterogeneous joins.

If you’re in the market for a logical data warehouse vendor, don’t rule out Microsoft. Stay tuned for more news around PolyBase and the investments Microsoft makes in this area after the Metanautix acquisition.

03 Nov 16:11

IDERA Releases Major Update to DB PowerStudio

by A.R. Guess

by Angela Guess A new press release reports, “IDERA, a leading provider of database lifecycle management solutions, today released DB PowerStudio 2016+, designed and built from the ground up by IDERA following the company’s acquisition of Embarcadero in October 2015. With this update, IDERA continues its commitment to investing in the most popular database platforms […]

The post IDERA Releases Major Update to DB PowerStudio appeared first on DATAVERSITY.

03 Nov 16:11

Artificial Intelligence and the Future: Should We Worry About Our Data Jobs?

by Amber Lee Dennis

Speaking at the DATAVERSITY® Smart Data Online 2016 Conference, Adrian Bowles, industry analyst, recovering academic, and founder of STORM Insights, and Steve Ardire, Merchant of Light, both talked about the future of Artificial Intelligence (AI). They addressed the question: “Will robots take my job?” The short answer could be summarized as: “Maybe.” Lower level jobs […]

The post Artificial Intelligence and the Future: Should We Worry About Our Data Jobs? appeared first on DATAVERSITY.

03 Nov 16:11

PASS Summit 2016 – Keynote 2

by Rob Farley

Thursday! Kilt day.

We start with Grant Fritchey (PASS’ VP of Finance, in a kilt), talking about the various metrics of PASS, which show that the community is growing both numerically and graphically, reaching 87% of countries now. It’s good to know that things are going well. This is all public information, and I’m not going to go into the details here.

He also announces that PASS will have a BA Day – Jan 11th in Chicago. More information on this will follow.

Grant hands over to Denise McInerney (PASS’ VP of Marketing). She announces new branding for the PASS organisation – logo and website (website launching early next year) – and the dates for next year’s summit.

David DeWitt, Adjunct Professor at MIT (previously of Microsoft Research) comes up. He’s going to talk about data warehouse technologies, including cloud and scaling. Amazon Redshift, Snowflake, and SQL DW.

A great session, which will have helped a lot of people appreciate SQL DW more than ever.

@rob_farley

03 Nov 16:11

PASS Summit Announcements: DMA/DEA

by James Serra

Microsoft usually has some interesting announcements at the PASS Summit, and this year was no exception.  My next few blogs will cover the major announcements.  This first one is about the Data Migration Assistant (DMA) tool v 2.0 general availability and the technical preview of Database Experimentation Assistant (DEA).

In short, customers will be able to assess and upgrade their databases using DMA, and validate target database’s performance using DEA, which will build higher confidence for these upgrades.

More details on DMA:

DMA enables you to upgrade to a modern data platform on-premises or in Azure VM by detecting compatibility issues that can impact database functionality on your new version of SQL Server.  It recommends performance and reliability improvements for your target environment.  It also allows you to not only move your schema and data, but also uncontained objects from your source server to your target server.

DMA replaces all previous versions of SQL Server Upgrade Advisor and should be used for upgrades for most SQL Server versions.  Note that the SQL Server Upgrade Advisor allowed for migrations to SQL Database, but DMA does not yet support this (it only performs the assessment).

DMA helps you to assess and migrate the following components, along with performance, security and storage recommendations, in the target SQL Server platform that the database can benefit from post upgrade:

  • Schema of databases
  • Data
  • Server roles
  • SQL and windows logins

After the successful upgrade, applications will be able to connect to the target SQL server databases with the same credentials.  Click here for more info.

Microsoft Download Center links: Microsoft® Data Migration Assistant v2.0

More details on DEA:

Technical Preview of DEA is the new A/B testing solution for SQL Server upgrades.  It enables customers to conduct experiments on database workloads across two versions of SQL Server.  Customers who are upgrading from older SQL Server versions (starting 2005 and above) to any new version of the SQL Server will be able to use key performance insights, captured using a real world workload to help build confidence about upgrade among database administrators, IT management and application owners by minimizing upgrade risks.  This enables truly risk-free migrations.

The tool offers the following capabilities for workload comparison analysis and reporting:

  • Automated script to set up workload capture and replay of production database (using existing SQL server functionality Distributed Replay & SQL tracing)
  • Perform statistical analysis model on traces collected using both old and new instances
  • Visualize data through analysis report via rich user experience

Supported versions:

Source: SQL Server 2005, SQL Server 2008, SQL Server 2008 R2, SQL Server 2012, SQL Server 2014, and SQL Server 2016

Target: SQL Server 2012, SQL Server 2014, and SQL Server 2016

Click here for more info.

Microsoft Download Center links: Microsoft® Database Experimentation Assistant Technical Preview

More info:

Top 5 Announcements at PASS Summit 2016

03 Nov 16:10

Temporal Tables

by Davide Mauri

I have delivered a talk about “SQL Server 2016 Temporal Tables” for the Pacific Northwest SQL Server User Group at the beginning of October . Slides are available on SlideShare here:

http://www.slideshare.net/davidemauri/sql-server-2016-temporal-tables

and the demo source code is — of course — available on GitHub:

https://github.com/yorek/PNWSQL-201610

The ability of automatically keep previous version of data is really a killer feature for a database since it lift the burden of doing such really-not-so-simple task from developers and bakes it directly into the engine, in a way it won’t even affect existing applications, if one needs to use it even in legacy solutions.

The feature is useful even for really simple use cases, and it allows to open up a nice set of analytics options. For example I’ve just switched the feature on for a table where I need to store that status of an object that needs to pass through several steps to be processed fully. Instead of going through the complexity of managing the validity interval of each row, I’ve just asked the developer to update the row with the new status and that’s it. Now querying the history table I can understand which is the status that takes more time, on average, to be processed.

That’s great: with less time spent doing technical stuff, more time can be spend doing other more interesting activities (like optimizing the code to improve performance where analysis shows they are not as good as expected). 

03 Nov 16:10

Power BI Reports in SSRS Techinical Preview

by Prologika - Teo Lachev

From the glimpse to the first public preview…it’s great to see one of most requested feature coming to life: ability to render online Power BI reports in on-premises SSRS. Alas, Microsoft is keeping us in suspense and no official date and release vehicles have been announced yet but we can now see and test it using the VM that Microsoft put on Azure (read the Chris Finlan’s steps to get started).

At this point, the integration supports only Power BI Desktop files that connect to Analysis Services (Multidimensional and Tabular). Attempting to deploy models connected to something else or with imported data, doesn’t work and you’ll get an error. Custom visuals and R visuals are not supported yet. For the most part, the integration is limited to report viewing only (similar to what you get if you embed Power BI reports in Power BI Embedded). That’s will be probably fine to start with but it will be nice to have Q&A, Quick Insights, Analyze in Excel, and ability to create custom reports online.

It’s great that Microsoft decided to expose the connection string as a regular SSRS data source. This will allow you to change the credentials settings, such as to impersonate the user when Kerberos is not an option and the SSAS is on another server. When the Power BI Desktop file is uploaded, it’s saved in the report catalog as any regular SSRS report (now referred to a paginated report). This means that you secure and manage Power BI reports the same way you work with paginated reports. Speaking of management, I hope that at some point Microsoft will add support for subscriptions and caching. What’s need is unification among the four types of reports: paginated, mobile, Power BI, and Excel. While waiting, we get a handy bonus feature: ability to add comments to reports, such as to get someone to formally approve what they see on the report.

Everyone is asking about SSRS support for Power BI reports. Progress has been make and I hope it won’t be long before we get the real thing.

03 Nov 16:10

Slack, IBM Partner to Bring Watson to Developers

by A.R. Guess

by Angela Guess According to a recent press release, “IBM and Slack are partnering to bring Watson to Slack’s global community of developers and enterprise users. Drawing on the power of Slack’s digital workplace and the cognitive computing capabilities of Watson, developers will be able to create more offerings — including bots and other conversational […]

The post Slack, IBM Partner to Bring Watson to Developers appeared first on DATAVERSITY.