From my previous post, we learned that no matter how much we segment and bucket people, each of us is pretty unique. The reason some of us may appear similar to a brand is that brands typically don’t have enough data to tell us apart. Big data changes that! With social and behavioral data, we will have enough data to view a user along hundreds of different dimensions. With so many dimensions, the chance of finding two individuals who match along all of them is vanishingly small. We are truly unique!
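To make the uniqueness argument concrete, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every dimension is independent and that two random people agree on any single dimension with the same probability `p_match`. Neither assumption comes from real data, and the function names are hypothetical.

```python
def match_probability(num_dimensions: int, p_match: float = 0.5) -> float:
    """Chance that two people agree on every dimension (independence assumed)."""
    return p_match ** num_dimensions


def expected_matching_pairs(population: int, num_dimensions: int,
                            p_match: float = 0.5) -> float:
    """Expected number of pairs in the population that agree on every dimension."""
    pairs = population * (population - 1) / 2
    return pairs * match_probability(num_dimensions, p_match)


if __name__ == "__main__":
    # Illustrative population size and dimension counts, not measured values.
    for d in (10, 50, 100, 200):
        print(d, expected_matching_pairs(7_000_000_000, d))
```

Under these assumptions, the expected number of exact look-alike pairs falls off exponentially as dimensions are added. How quickly it falls depends heavily on `p_match` and on how correlated the real dimensions are.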
The fact that we are unique demands personalization, but this is traditionally a very challenging data science problem. In fact, it’s prescriptive analytics. Last time we discussed 3 early approaches to personalization and the challenges they face.
- Asking users to self-declare their interests and preferences
- Learning from the user’s own past behaviors and recommending similar content
- Learning from other users’ behaviors (e.g. traditional collaborative filtering) and recommending content based on the collective interests of others like you
So how do we (Lithium) and others (e.g. Amazon) tackle these problems?
Since people’s interests are collectively pretty unique, we generally cannot infer a person’s interests from the interests of other look-alikes, because in the face of big data there really aren’t any look-alikes. This means traditional collaborative filters (CF) that rely on learning and extrapolating from other users will not perform well when inferring a person’s interests. Without an accurate understanding of a user’s interests, the recommended content will not be truly personalized.
In order to ensure a highly personalized experience for a particular user, the recommender system should only use data from that user, and not generalize across different users. However, the majority of users will not have enough existing data on the platform for the recommender system to leverage, so initial recommendations are poor (i.e. the cold start problem), leading to a poor customer experience and ultimately abandonment of the platform. So how can we get enough data about a particular individual to achieve hyper-personalization at mass scale, for anyone and everyone?
A Social Approach to Personalization
Our Klout data science team took a novel approach to this problem. We recognized that some extrapolation is necessary to overcome the data sparsity at the individual level, but we don’t have to extrapolate from other users. We can take advantage of the richness and abundance of public social media data. Because social media is very pervasive, almost everyone uses it to some extent. So existing public social media data is available at a mass scale.
Public social media data is also persistent: on the whole, it stays forever (unless the user explicitly deletes posts). This means we can look way back into history to get enough data to infer a user’s interests. Moreover, people’s interests usually don’t change rapidly, so we can recommend content based on the inferred interests of the user. This approach is basically learning from the user’s own past social media interactions.
To draw the analogy with our imaginary friend from the last post: Cortana knows you from seeing how you interact with your friends and family, who you hang out with, what you talk about, what your likes and dislikes are, etc. From this past social interaction data, Cortana will be able to infer your interests and preferences and recommend personalized products for you, even though she has never gone shopping with you.
This approach is more natural, because it is actually how we operate in the physical world. We may not know all the things our friends purchased, or every movie they watched, but from being with them in other social contexts (e.g. at parties, over dinner, while hiking, etc.), we can definitely learn something about their interests and recommend relevant products and movies they may like.
This approach allows us to use data from the user directly. We may not have enough past consumption data from the user, but we don’t need it, because we are using their own data from a different source (i.e. social media). So we don’t have to extrapolate from other users, even though we are still extrapolating. We simply extrapolate from a different context (i.e. the social context) of the same user. This works because, from the psychology of cognitive dissonance, we know that people are generally consistent across different contexts.
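As a rough illustration of the idea (and not the actual Klout or Profile Plus implementation, which this post does not describe in detail), the sketch below builds an interest profile from one user’s own public posts and ranks candidate content against it. The keyword-to-topic table, the catalog, and all function names are hypothetical stand-ins. A real system would use far richer topic inference than keyword matching.

```python
import math
from collections import Counter

# Hypothetical keyword -> topic table, used only for this illustration.
TOPIC_KEYWORDS = {
    "hiking": "outdoors", "trail": "outdoors", "camping": "outdoors",
    "camera": "photography", "lens": "photography",
    "recipe": "cooking", "baking": "cooking",
}


def infer_interests(posts):
    """Count topic mentions across a single user's own public posts."""
    counts = Counter()
    for post in posts:
        for word in post.lower().split():
            topic = TOPIC_KEYWORDS.get(word.strip(".,!?#"))
            if topic:
                counts[topic] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse topic-weight mappings."""
    dot = sum(weight * b.get(topic, 0) for topic, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(posts, catalog, top_n=2):
    """Rank catalog items by similarity to the user's inferred interests."""
    profile = infer_interests(posts)
    scored = [(cosine(profile, topics), name) for name, topics in catalog.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:top_n]]


if __name__ == "__main__":
    posts = ["Great hiking trail today!", "New camera lens arrived",
             "Camping this weekend"]
    catalog = {"Tent": {"outdoors": 1}, "Tripod": {"photography": 1},
               "Stand mixer": {"cooking": 1}}
    print(recommend(posts, catalog))
```

Note that nothing here looks at any other user. The only extrapolation is from the social context to the consumption context of the same person, which is why it can work even when the user has no purchase history on the platform.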
This is how our Profile Plus feature offers our community members a unique personalized experience. It works very well as soon as people enable it, because we look back into history. It’s like Cortana has already known you for a long time, even at the beginning. Which means no more cold start. When consumers can immediately recognize the difference, I believe they will embrace it even more.
The Power of Hybrid Approaches
If you are still with me, you are probably wondering, didn’t Amazon solve the personalization challenge with their famous recommender system based on CF? The answer is “yes, they did.” But for me, the more interesting question is how did they overcome the challenge of traditional CF?
Amazon’s famous item-to-item CF is not really a traditional CF. It’s not recommending items based on other users who are similar to you. There is a subtle difference:
- it is “people who bought Y also bought X,” which is a recommendation of item X because it’s similar to item Y.
- and not “people like you also bought X,” which is a recommendation of item X because it’s the preference of others like you.
So Amazon is merely recommending similar items to what you’ve purchased in the past (i.e. Y, whatever it is).
The genius in Amazon’s item-to-item CF is that it leverages people’s co-ownership of any pair of items (whatever they may be) as an empirical measure of their similarity. So if a lot of people own both a GoPro camera and a Louis Vuitton bag, then these two items must be somehow similar, even though there may not be any apparent similarity. In this view, Amazon’s item-to-item CF is actually more comparable to the approach of learning from the user’s own past purchase behavior. However, it also takes advantage of the third approach: learning from other users’ behaviors to determine which items are similar to those you’ve purchased. Therefore, Amazon’s item-to-item CF is really a hybrid method that combines the two approaches.
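The co-ownership idea can be sketched in a few lines. This is a minimal illustration of the general technique, not Amazon’s production system. It assumes a cosine-style similarity computed over the sets of owners of each item, and the basket data and function names are made up for the example.

```python
import math
from collections import defaultdict
from itertools import combinations


def item_similarities(baskets):
    """Map each co-owned (item_a, item_b) pair to a cosine similarity.

    baskets: dict mapping user -> set of items that user owns.
    """
    owners = defaultdict(set)
    for user, items in baskets.items():
        for item in items:
            owners[item].add(user)

    sims = {}
    for a, b in combinations(sorted(owners), 2):
        co_owners = len(owners[a] & owners[b])
        if co_owners:
            score = co_owners / math.sqrt(len(owners[a]) * len(owners[b]))
            sims[(a, b)] = sims[(b, a)] = score
    return sims


def recommend_similar(user, baskets, sims, top_n=3):
    """Recommend items similar to what this user already owns."""
    owned = baskets[user]
    scores = defaultdict(float)
    for (a, b), score in sims.items():
        if a in owned and b not in owned:
            scores[b] += score
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


if __name__ == "__main__":
    baskets = {
        "ann": {"gopro", "lv_bag"},
        "bob": {"gopro", "lv_bag", "tripod"},
        "cat": {"gopro"},
        "dan": {"tripod"},
    }
    sims = item_similarities(baskets)
    print(recommend_similar("cat", baskets, sims))
```

Both halves of the hybrid are visible here: `item_similarities` learns from everyone’s behavior, while `recommend_similar` only consults the target user’s own purchases.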
In machine learning, it is well known that hybrid approaches tend to outperform any single approach, because each can compensate for the systematic errors of the other. There is a whole class of learning algorithms, called ensemble learning, that aims to combine different models to produce a better result than any of them alone. Note that the famous Netflix Prize was won by a team that used ensemble methods. Our social approach that leverages people’s existing social data can also be combined with more traditional collaborative filters. Although we haven’t implemented this hybrid approach yet, it would certainly be an interesting future extension.
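Since this hybrid has not been built, the following is only a sketch of what the simplest version might look like: a weighted average of two recommenders’ scores. Weighted blending is one basic form of ensembling, chosen here for brevity. The score dictionaries, the `social_weight` parameter, and the function names are all assumptions for illustration.

```python
def blend_scores(social_scores, cf_scores, social_weight=0.5):
    """Weighted average of two recommenders' scores; missing items score 0."""
    items = set(social_scores) | set(cf_scores)
    return {
        item: social_weight * social_scores.get(item, 0.0)
        + (1 - social_weight) * cf_scores.get(item, 0.0)
        for item in items
    }


def rank(scores):
    """Order items from highest to lowest blended score (ties by name)."""
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    # Made-up scores from a social-interest model and a collaborative filter.
    social = {"tent": 0.8, "tripod": 0.5}
    cf = {"tripod": 0.9, "lv_bag": 0.8}
    print(rank(blend_scores(social, cf)))
```

In practice the weight would be tuned on held-out data, and it could lean toward the social score for new users who have little on-platform history.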
Conclusion
With big data, brands can finally see us in multiple dimensions and recognize us as unique. So brands really have no more excuses not to offer their customers a personalized experience. Yet personalization is a challenging prescriptive analytics problem. Amazon has had its early success using an ingenious hybrid approach that combines learning from the user’s past behavior and learning from other users’ collective behaviors.
We took a social approach to mass personalization. This approach involves learning from the user’s own social media interactions. It doesn’t extrapolate from other users, so the recommendations are truly personalized. And because public social media data is both pervasive and persistent, nearly everyone can get personalized content recommendations that should be hyper-relevant from the get-go. This is the social approach we use in Profile Plus to power our own personalized community experience.
BTW, if you are interested in learning more about Profile Plus, please join us for a fireside chat on September 28 at 11AM Pacific Time. Not only will we explore the data science behind Profile Plus in greater depth, we will also look at how it works and discuss how it can benefit your business. And if you are really interested, we have some real customer examples to share.
Image Credit: Unsplash, geralt, and geralt.
Michael Wu, Ph.D. is Lithium's Chief Scientist. His research includes: deriving insights from big data, understanding the behavioral economics of gamification, engaging + finding true social media influencers, developing predictive + actionable social analytics algorithms, social CRM, and using cyber anthropology + social network analysis to unravel the collective dynamics of communities + social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics + its application to Social CRM. He's a blogger on Lithosphere, and you can follow him @mich8elwu or Google+.