Facebook the Social Data Queen vs. Google the Personal Data King
“fight in the recommendation field”
Many people, including me, predict that the next step in web technology will be personalization: a web that understands and provides for each user's needs. This personalization will be delivered through recommendations, as a result of the Semantic Web. Considering the recent dispute between Google’s Friend Connect service and Facebook, it can be seen that the value of social and personal data, which is essential for personalization and recommendation engines, is increasing. In this post, I am going to discuss recent developments in the Social Data War and what companies aim to do with this data. I am also going to talk a little about a project of mine, iletken, which is closely related to the issue. The last topic will be the difference between social and personal data, which is hard to distinguish and rarely discussed. That distinction can shed light on the real reason for the war between the two data holders: Facebook the Social Data Queen, and Google the Personal Data King.
What are Google Friend Connect and others leading to?

Google Friend Connect and some newly emerging services aim to make one’s existing social network information usable on any website or service. As a website owner (service provider), instead of creating your own social service to boost your productivity, you can now use existing ones!
As a user, these services let you see what your friends are doing on those websites: for instance, which movie a friend bought, or which pictures a friend viewed or posted.
But what is behind the scenes? What are Google and others trying to achieve? And why did Facebook, which had recently been criticized over its Beacon service, exit from GFC?
Surely there is a contradiction in Facebook quitting GFC on the grounds that Google gathers data without users’ permission. If you remember Facebook’s Beacon service, you might realize that both have very similar underlying aims.
Let’s hear how those two firms present their products:
Google says:
“Google Friend Connect lets you grow traffic by easily adding social features to your website. With just a few snippets of code, you get more people engaging more deeply with your site.”
Facebook says:
Facebook Beacon:
“Enable your customers to share the actions they take on your website with their Facebook friends.
Facebook Beacon actions include purchasing a product, signing up for a service, adding an item to a wish list, and more”
While offering enhancements for web service owners and users, GFC and Facebook Beacon get the chance to gather user behavior data from the websites to which their “little code” has been added. This extremely valuable data feeds the “recommendation” business, which can have direct financial outcomes.
What can be done by analyzing this information varies. These companies can improve the service quality of the websites where their code is installed, or they can use the information for their own purposes. By improving service quality, I mean recommending to people the products and services that are most likely to be valuable to them. For instance:
- last.fm: bringing the music that you are probably going to like
- Amazon.com: recommending some products to buy.
- iletken (the ongoing project of mine) : providing news and blog posts to each user that they are most likely to be interested in.
We call this kind of content: Personally Relevant content
How can recommendation engines help you improve service quality?
Recommendation engines provide personally relevant content to enable personalization.
Here is the explanation of Personally Relevant content from the presentation of the iletken project:
Personally Relevant Content:
– Content, that the user finds valuable
– Delivered for each user separately
– System knows what content user wants
– Example: Out of a huge news database, only personally relevant content is brought for the user.
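As a rough illustration of the example above, here is a minimal sketch that picks personally relevant headlines out of a pool. It is not how iletken works; the interest profile built from word counts, and the names build_profile and relevant_items, are assumptions made only for this example.

```python
from collections import Counter


def build_profile(read_titles):
    """Count the words in titles the user already read (a toy interest profile)."""
    profile = Counter()
    for title in read_titles:
        profile.update(title.lower().split())
    return profile


def relevant_items(profile, candidates, top_k=2):
    """Return up to top_k candidate titles, ranked by word overlap with the profile."""
    def score(title):
        # Each distinct word contributes its weight from the profile once.
        return sum(profile[word] for word in set(title.lower().split()))

    scored = [(score(title), title) for title in candidates]
    # Drop titles with no overlap at all: they are not "personally relevant".
    scored = [(s, t) for s, t in scored if s > 0]
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


profile = build_profile([
    "google launches friend connect",
    "facebook beacon privacy",
])
news = [
    "google buys startup",
    "football results tonight",
    "facebook changes privacy policy",
]
print(relevant_items(profile, news))
```

A real engine would use far richer signals than title words, but the shape is the same: learn something about the user, score every item against it, and serve only the top of the list.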
And here is an excellent quote from the ReadWriteWeb article: The Art, Science and Business of Recommendation Engines
“A good recommendation engine can make a difference not just for Netflix, but for any online business. This is because there are two fundamental activities online - Search and Browse. When a consumer knows exactly what she is looking for, she searches for it. But when she is not looking for anything specific, she browses. It is the browsing that holds the golden opportunity for a recommendation system, because the user is not focused on finding a specific thing - she is open to suggestions.”
If you are interested in this issue, you can also read the article: Web 3.0: Is It about Personalization?
To sum up, relevant content that reaches the relevant person is more valuable than randomly served content. This personally relevant content can be served by recommendation and personalization engines, and those engines need to know about their users. The advantages of recommendation engines can be seen in many fields. Let’s not only think about web services: suppose that you are Wal-Mart. If you analyze and cluster your customers, and find even a little information and a pattern about what they are going to be interested in, you can improve your sales and lower your storage costs by using relevant recommendations, smart logistics and better storage algorithms. Actually, firms like Wal-Mart already do this, but they probably do it intuitively. Through years of experience, salespeople come to know the trends and what each customer is going to be interested in. On the Web, this can be done in a smarter way.
This smarter way is what Google, Facebook, Last.fm, Strands, iletken and many others are competing for.
Is it something new?
If you have been to some data mining classes, you might remember a classic ten-year-old story: Microsoft acquiring a firm called Firefly in 1998.
Firefly came out of MIT’s Media Lab and aimed to deliver personal content based on user behavior analysis. The service became the first step toward Microsoft’s Passport service and was discontinued as a recommendation engine. The reasons for its failure are open to debate. They probably include some privacy issues, but I believe the main reason was the lack of easily accessible data to feed the service at that time. Today, in the Web 2.0 to 3.0 era, there is quite a large amount of data. That data is what Google, Facebook and others are fighting for. This is the modern Oil War of the Web: the Data War.
Personal Data vs. Social Data
In order to understand the main motives of this war, we need to understand the differences between Personal Data and Social Data. We also need to understand why Social Data is valuable. Why does Google want to gather data from the social networks that a user is registered with? And why does Facebook not wish to share this information with Google?
An extremely simplified explanation:
Personal Data: The information about a particular user including his behavior data.
Social Data: Personal Data + Social information including friendship, network, etc…
These data types are required for recommendation engines to work. There are two main ways recommendation engines work:
- Item based recommendation
- Social recommendation
If you wish to learn the differences you can check the article on ReadWriteWeb
While personal data is required for item based recommendation, social data is not. Social data is also not mandatory for social recommendation, but it improves performance dramatically. Social recommendation is a type of recommendation based on analyzing the interactions of users. This can be done without knowing the social connections of a person.
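To make the two approaches a bit more concrete, here is a minimal sketch of both, working only from behavior data (who interacted with what) and no friendship information. It is my simplified reading of the two types, not a reference implementation; the Jaccard similarity measure, the function names and the toy listening history are all assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def _rank(scores, top_k):
    # Highest score first, ties broken by name; zero scores are dropped.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_k] if score > 0]


def item_based(history, user, top_k=3):
    """Recommend items whose audience overlaps with the items the user already has."""
    audience = {}  # item -> set of users who interacted with it
    for person, items in history.items():
        for item in items:
            audience.setdefault(item, set()).add(person)
    owned = history[user]
    scores = {
        candidate: sum(jaccard(users, audience[o]) for o in owned)
        for candidate, users in audience.items()
        if candidate not in owned
    }
    return _rank(scores, top_k)


def interaction_based(history, user, top_k=3):
    """Recommend items taken from users whose behavior resembles this user's."""
    owned = history[user]
    scores = {}
    for other, items in history.items():
        if other == user:
            continue
        similarity = jaccard(owned, items)
        for item in items - owned:
            scores[item] = scores.get(item, 0.0) + similarity
    return _rank(scores, top_k)


# Hypothetical listening histories: user -> set of genres listened to.
history = {
    "ann": {"jazz", "blues"},
    "bob": {"jazz", "blues", "soul"},
    "cem": {"metal", "punk"},
    "dee": {"jazz", "soul"},
}
print(item_based(history, "ann"))
print(interaction_based(history, "ann"))
```

Neither function ever looks at who is friends with whom, which is the point made above: both types can run on behavior data alone, and social connections are an extra signal on top.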
On the other hand, having social data opens up many ways to improve a recommendation engine. The basic idea: people want to know what their friends are doing. I am not going to explain the details of how this information can be used, because it is a hot topic and some people, including me, are working on innovative ways to use this data. Check the last part of the post for a glimpse of it.
Who is going to win in the War?
Both companies, Google and Facebook, have advantages and disadvantages.
Facebook has the best social data available, but most of it is limited to usage inside Facebook. Facebook cannot know how its users behave on other websites; for example, it cannot know what they buy from Amazon. This is why Facebook pushes its Beacon service, and also why it offers services through its application API that you can integrate into your website (external applications).
Google doesn’t really have social data, but it has great personal data. I know Google has Orkut, but Orkut does not have the popularity of Facebook, and Facebook users also create more valuable data to analyze than Orkut users do. On the other hand, because of Google’s other services, including Gmail and Search, Google has enormous amounts of personal data. I believe that with Google Friend Connect, Google aims to fill its need for social data.
To sum up, I have talked about many issues regarding recommendation engines and the oil required for these engines. This oil, called data, unlike real oil, tends to grow with time, but there is another discussion which might one day limit our use of data: privacy.
______________________
A little bit of iletken
If you have checked out my two-year-old project iletken, you will have seen that the aim is the same: use social connections to provide personally relevant content! iletken is a project based on the hybrid use of different data types and balanced social connections. I am going to talk a little about iletken to give some information to people who want to work on building or using a recommendation engine. I have to confess that I am also at the first steps of creating a recommendation engine, and there is much I don’t know yet.
The prototype: News and blogs
We built the first prototype of iletken as a graduation project, working on news that we gathered via RSS. We continued working on it for a total of 6 months to turn it into a real product, which unfortunately didn’t happen.
The issues & opportunities: Video and Social Search
Creating a recommendation engine is a different job for each type of content. While the main idea is the same, many aspects change as the content type changes. For instance, detecting whether one item is a duplicate of another can be relatively easy and accurate if you are working on news, compared to videos, because videos might require video processing. On the other hand, since news emerges rapidly and has to be served immediately, processing time and aging have to be taken into consideration.
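Both points can be sketched in a few lines for text news. This is only an illustration under my own assumptions, not what we did in iletken: duplicates are detected by comparing overlapping three-word sequences (shingles), and aging is modeled as an exponential decay with a made-up half-life of 12 hours. The function names and thresholds are hypothetical.

```python
def shingles(text, size=3):
    """Split text into overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def is_duplicate(a, b, threshold=0.5):
    """Treat two texts as duplicates when their shingle sets overlap enough."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) >= threshold


def freshness(age_hours, half_life_hours=12.0):
    """Weight in (0, 1] that halves every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


first = "google launches friend connect for websites today"
second = "google launches friend connect for websites worldwide"
other = "facebook pulls out of friend connect"

print(is_duplicate(first, second))  # near-identical headlines
print(is_duplicate(first, other))   # different story
print(freshness(0), freshness(12), freshness(24))
```

A relevance score multiplied by freshness lets a new story outrank an older one of equal relevance. Nothing this simple exists for video, where even deciding that two files show the same clip needs real processing.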
One other thing to consider is the source of the content. If you are not YouTube or CNN, you need to get content from various sources. That also leads to issues such as the different organization and classification of content. For instance, the same news item can be filed as sports on one source and as humor on another. If you leave this organization to users, they might tag the content differently. These are only some of the things that a development team has to take into consideration.
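One simple way to cope with conflicting labels, sketched here purely as an assumption on my part, is to map every source or user label onto a small canonical set and then let the labels vote. The LABEL_MAP table and the function names are hypothetical, and a real system would need a much larger mapping.

```python
from collections import Counter

# Hypothetical mapping from raw source/user labels to canonical categories.
LABEL_MAP = {
    "sport": "sports",
    "sports": "sports",
    "football": "sports",
    "funny": "humor",
    "humor": "humor",
    "humour": "humor",
}


def canonical(label):
    """Map a raw label to its canonical category, or 'other' if unknown."""
    return LABEL_MAP.get(label.strip().lower(), "other")


def resolve_category(labels):
    """Pick the most common canonical category; ties are broken alphabetically."""
    counts = Counter(canonical(label) for label in labels)
    if not counts:
        return "other"
    return min(counts.items(), key=lambda pair: (-pair[1], pair[0]))[0]


# The same story as labeled by three different sources.
print(resolve_category(["Sport", "football", "funny"]))
```

Voting only hides the disagreement, of course. Keeping all canonical labels for an item is an equally reasonable choice when a story really is both sports and humor.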
While creating the iletken project, we always had a vision of creating a search engine that takes social connections into consideration, asking: what are people like you searching for? But it is not as easy as it sounds. If someday I graduate and have a team of more than 2 people, we might consider working on one.
———-
Sources:
http://www.readwriteweb.com/archives/recommendation_engines.php
http://www.readwriteweb.com/archives/web_30_is_it_about_personalization.php




