Archive for the 'social' Category

Wednesday, July 11th, 2007

OpenCI preparing to open up social network

Monday a week ago I visited Mediamatic on invitation from Willem Velthoven to talk about how they could fit in Portable Social Networks in their anyMeta system. This meeting was inspired by our meeting in Copenhagen and the talks we had about opening up social networks.

Picture by Matt Biddulph

anyMeta and OpenCI

Mediamatic.lab implements and maintains a series of social networking sites for the creative industries (CI) in Amsterdam. These are sites built on the anyMeta system that resemble structured wikis with a strong social dimension. They are positive towards open source, but the anyMeta system is not open source for reasons of manageability of the projects.

Since these sites overlap considerably, both in functionality and in the people who have an account on them, they wanted to abstract and syndicate the social stuff as much as possible. Currently people can have accounts on each of the different sites, all with the same information on them.

Seeing as Mediamatic builds anyMeta themselves and they have total control, it is very feasible for them to devise and mandate the exchange of information between their own sites. To enable the exchange between their own sites, they will use their own protocol and data format to provide for a high fidelity exchange of information. Leaving implementation details for what they are, it should become possible to use one account on any of the sites in the network.

To verify your identity on the various sites of the network they are going to enable OpenID consumer and provider functionality in the next version. This way they will have a way of distributed authentication both within their network of sites and throughout the rest of the internet.

anyMeta and the rest of the web

Microformats logo

Having solved the problem of information exchange between anyMeta sites, they would also like to play along with the rest of the internet as far as that is possible. Being able to share public information with the rest of the internet in a logical way is also on the agenda but not so straightforward.

Making public profile information available using hCard and related microformats looks easy enough. Problems arise however because the templates are made by different people and that is the location of the microformatted markup. This means the template authors have to be educated on the subject of microformats.

Whenever I advocate the use of microformats, I always have to fight against the blank looks and criticism about the applicability of the technology. It’s a solid Catch-22 that has to be taken on with real-life use cases and benefits to extol the virtues of a dirty semantic web. For hCard there are various uses cropping up over the internet, but for the others it is a lot more limited. Having microformatted data on sites and being able to parse that using browser plugins is a first step and essential groundwork for the real use cases and richer interaction that we all want to have.

Another plan they have at Mediamatic is to first enable the sharing of information between their sites and make plugins for some of the bigger CMS’es out there (Drupal, Joomla) so they can also exchange information with those systems.

In these use cases and in the case with the internet the issue of fidelity comes up again and again. How much information can you exchange reliably and what do you do when stuff is missing? This is an important and valid question with no ready answer; though mine would be ‘get what you can, and ignore the holes where possible’.

Other stuff

Facebook logo

I am currently not implementing anything relating to OpenID and Social Networks but I think I would like to. One idea was to make a Facebook front-end site which uses the information in Facebook to offer you a microformatted profile. There already is an hCard application but extending this with XFN, hReview and hResume would be a real winner.

Yesterday at the O’Reilly event I heard about Yme Bosma, whose job it now is to drag Hyves kicking and screaming into the world of Open Social Networks. I wish him a lot of luck as that would be a good thing to have. I have started my own work on scraping the Hyves site but that hasn’t been as simple as I would have liked.

Monday, June 25th, 2007

Why most web start-ups don’t fly

Running a business isn’t too difficult if you respect some basic rules. Rule number one: you have to offer potential customers something they need. If you do the math well, exploit your network, practice some good marketing and have a bit of luck, you will probably succeed.

Nevertheless, 99% of all web start-ups die before they fly. That figure is higher than in any other industry. Why is that? Because most web start-ups don’t offer something customers actually need. Many people in the web 2.0 scene seem to disregard this.

I can see why. It’s relatively cheap to build web sites and with a potential worldwide market the prospects are extremely positive. Entrepreneurs, investors and enthusiasts - they all get carried away by the figures.

But if you don’t manage to tap into that worldwide market, it’s a whole different game. It’s not just bad for investors; it might blow up the industry once more (remember bubble 1.0?).

That’s why I believe we all need to be a bit more critical. Virtual communities might be the future of the web, but this doesn’t mean that any community will stand a chance, let alone be profitable. In the end, thinking of a good business model first is cheaper than just building web sites. It will pay off in the long run.

Wednesday, June 20th, 2007

The Future of Everything is Social: Consolidate and take back your social network

This is a delayed entry about a small session I held on Reboot about social networks. It ties in nicely to a recent series on Four Starters about trust and how friends are a solution to this.

In this article I will lay out why social networks are too important to leave in other people’s walled gardens and I will sketch a tentative way to connect the gardens and cultivate your own using microformats and other open standards.

Feedback is greatly appreciated.

The buddy list is key

Social networks and your online identity were prominent in various Reboot talks this year. I lifted Stowe Boyd’s quote: “The buddy list is the center of the universe.” (see slide). This has been true always but is ringing more so as the web matures and we are seeing the breakdown of centralized application models.

In a recent article, Dare Obasanjo wrote about the same subject and how he thinks that Facebook is going to be a big driver in this space. Maybe for America, but I don’t see this happening for Europe in the near future.
And does it seem like a good idea to have all the social information of the world under the control of one company?

But as Dare says, knowing who I know and who I trust, whether that information is in your address book or in your IM application, is usable in other contexts and can greatly improve the trust and interaction in those contexts. You apply the wisdom of crowds to a subset of people —the people you know— to circumvent the trust breakdown Reinier wrote about in current sites.

Yet another social network

My session was after that of Willem Velthoven (write up) who talked about their anyMeta social networking application. Their angle is to reduce the duplication of effort and enable sharing of data. Willem spoke about how he got quite sick of filling in his profile on every social networking site he wanted to participate in.

I have the same feeling and that is mainly what stopped me from filling in my Facebook profile, that and the fact that nobody I know is on Facebook. I could not bring myself to fill in another profile and try to get all my friends onto a new social network just because it looks like the next big thing.

Putting your profile information into these closed social networks gives them a lot of value but you rarely get the option of retrieving that information or using it elsewhere. The facebook API is an exception and is one way of getting access to data while it stays firmly in the silo. At what terms you can get at the data and what you can use it for is firmly in the hands of Facebook.

Microformats to the rescue

What I propose and what we talked about during my session is the concept of Portable Social Networks enabled by microformats. A social network that is open and readable and can be with you anywhere you want. During the Mediamatic session we discussed the use cases and the issues that would crop up and I invited people to attend my session for a discussion on the technical aspects.

We had a brief chat afterwards with interested people and after a break I was joined by some people among whom Willem Velthoven and Jeremy Keith (his post) to talk further about the technical stuff. I got the impression that some of the people attending my session were happy —maybe relieved even— that there was also a technical session to be found on Reboot.


Photograph by Jeremy Keith

Photograph by Tijs Teulings

I will go into deeper technical detail in a following post but the concept is to use microformats to mark up most of the information found on a typical Myspace, Facebook or Hyves (the Netherlands’ most prominent social network) profile page. This way you can either get the data out from supporting social applications or hook into the network by hosting your own identity web page with correctly formatted data.

So what can we do with technology we already have (POSH+microformats)? As it would seem, a lot:

hCard

Your hCard can contain most of your personal information including your personal details, your address and your picture. This is equivalent to the personal details and picture which are usually listed on any given social network.

XFN

Most social networks have a prominently visible list of friends and a count of your total number of friends. This is very easy to implement using XFN. You can mark up the links to the people in your network with the rel="" metadata which XFN defines. XFN allows you to hook into the network from anywhere. So an XFN link from, say, Hyves could also lead to a profile page on another social network or a self-hosted one. Or at least, that is the vision.

This way you can also differentiate between trust levels without using the numeric values which Reinier suggested would be necessary. I am bound to trust a friend more than a contact and you could derive more relations like that.
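A minimal sketch of reading XFN relations from a page and turning them into non-numeric trust tiers, using only Python's standard html.parser. The ordering friend, acquaintance, contact and the name TRUST_RANK are assumptions made for this sketch; XFN only names the relations and does not rank them.

```python
from html.parser import HTMLParser

# Friendship values named by XFN. The ranking is an assumption for this
# sketch, used only to order the tiers; XFN itself defines no ranking.
TRUST_RANK = {"friend": 3, "acquaintance": 2, "contact": 1}


class XFNLinkParser(HTMLParser):
    """Collects (href, set of rel values) for every <a> that carries a rel."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rels = set((attr.get("rel") or "").lower().split())
        if attr.get("href") and rels:
            self.links.append((attr["href"], rels))


def trust_levels(html):
    """Map each linked profile URL to the highest friendship tier found."""
    parser = XFNLinkParser()
    parser.feed(html)
    levels = {}
    for href, rels in parser.links:
        rank = max(TRUST_RANK.get(rel, 0) for rel in rels)
        if rank:
            levels[href] = max(rank, levels.get(href, 0))
    return levels
```

Feeding it a friends list with links such as rel="friend met" and rel="contact" yields a dictionary keyed by profile URL, with links that carry no friendship value left out.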

You could also link to other sites where you have a profile or store data such as your Flickr account, your del.icio.us bookmarks or any other site. The rel="me" value could be used for this, but it is required to be symmetric, so those other sites would have to link back using the same rel="me".
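The symmetry requirement can be sketched as a small check, assuming the rel="me" links of each page have already been collected into a dictionary. The function name and the example URLs are hypothetical.

```python
def confirmed_identities(me_links, home):
    """Return the pages that `home` claims with rel="me" and that link back.

    me_links maps a page URL to the set of URLs it links to with rel="me".
    A claim only counts when the other page points back at `home`.
    """
    claimed = me_links.get(home, set())
    return {url for url in claimed if home in me_links.get(url, set())}


# Hypothetical data: the photo site links back, the bookmark site does not.
links = {
    "http://example.org/": {
        "http://photos.example/alice",
        "http://bookmarks.example/alice",
    },
    "http://photos.example/alice": {"http://example.org/"},
    "http://bookmarks.example/alice": set(),
}
print(confirmed_identities(links, "http://example.org/"))
```

Only the reciprocated link survives, which is what keeps somebody else from claiming your accounts as theirs.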

hResume

A lot of social networks also allow you to list your current job, sometimes your previous places of employment as well, and in many cases also your school history so you can get in touch with former school friends.

This information is very similar to the information you can mark up using hResume. You could present only the information you want to share in a casual social network but you might want to enter a full hResume to provide all the functionality of professional social networks such as LinkedIn or Xing.

hReview

Most social networking sites and even Flickr have you keep lists of your hobbies, favorite music, movies and books. You could easily mark these up as hReviews and have the fn be a URL to a generally known catalog for that item. For books I would say something such as Librarything or Amazon (with an associate ID!), for music maybe Last.fm and for movies probably IMDb.

This also solves the problem Willem suggested could arise when we use different (language) titles for the same object. Just link them all to the same uniquely identifying resource.

OpenID

OpenID logo

The microformatted information listed above can be on any page you want, in fact it probably already is if you have a Flickr or a Twitter account. I do think that there is a case to be made for linking this to your OpenID.

OpenID solidifies the notion of identity online and creates a place for everything to come together. It gives you a URL with which you can refer to a person and you can be reasonably sure that the person and the URL belong together.

Applications that will want to use this kind of information will probably already ask for your login credentials. Those credentials could very well be an OpenID from where on you could automatically retrieve a load of information as described above.

Consolidating or delegating

The question is: do you host this information yourself or do you have someone else do it for you? Since I already host my own OpenID at http://alper.nl, I have already started by embedding an hCard there and I am building the MySpace-esque portal page with all my information on there (preview).

Not everybody will want to host this themselves, but it is analogous to OpenID. Anybody can self-host their OpenID but they can also use a hosted version and the same for the providers. If at any time you want to switch OpenID providers, just change the reference.
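To illustrate what "just change the reference" amounts to, here is a rough sketch that reads the delegation links from a self-hosted identity page. It assumes OpenID 1.x style delegation, where the page head carries link elements with rel="openid.server" and rel="openid.delegate"; the function name and the provider URLs in the usage note are made up.

```python
from html.parser import HTMLParser

DELEGATION_RELS = ("openid.server", "openid.delegate")


class OpenIDLinkParser(HTMLParser):
    """Collects the href of every <link> whose rel is an OpenID 1.x value."""

    def __init__(self):
        super().__init__()
        self.endpoints = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        for rel in (attr.get("rel") or "").lower().split():
            if rel in DELEGATION_RELS and attr.get("href"):
                self.endpoints[rel] = attr["href"]


def discover_delegation(html):
    """Return a dict such as {'openid.server': ..., 'openid.delegate': ...}."""
    parser = OpenIDLinkParser()
    parser.feed(html)
    return parser.endpoints
```

Switching providers then means editing the two href values on your own page; the URL you hand out as your identity stays the same.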

The various internet sites which host your profile information, such as the social networks and profile sites like MyBlogLog or 30boxes, need only mark up their essential data with these standards for it to become instantly accessible and portable.

Even for those hosting this information themselves, it would be nice to have some sort of interface beyond editing the HTML yourself. For instance it would be nice to have an ‘Add as friend’ button on your own site which would ask for your permission to add somebody to your XFN list.

Completeness and Clients

You can make the markup as rich as you want or you can leave stuff you don’t want to share out. Adoption and convention is completely up to you. There are advantages in adhering to a certain standard, but smart clients should be able to deal with holes in this picture.

Below are some use cases which can be realized today already and will also work if not all of the information is present.

Use case: Get an Avatar for somebody

I already talked about this in a previous article (“OpenAvatar - Combining OpenID and hCard”). This concept is just an extension which loads more data onto the page. If a page —such as an OpenID page— contains an hCard with an associated picture, you can retrieve it.


My avatar retrieved from my Flickr profile page

I already wrote a parser as a webservice which takes a URL and returns the associated picture. This parser can take either my OpenID or my Flickr profile (which contains an hCard). This way you can get an avatar for someone that they can manage and update to their own liking.
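The webservice itself is not shown here, so the following is only a sketch of the idea under stated assumptions: it looks for the first img with class "photo" nested inside an element with class "vcard" and resolves its src against the page URL. Fetching the page is left out, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Elements that never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HCardPhotoParser(HTMLParser):
    """Finds the first <img class="photo"> inside a class="vcard" element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside the hCard, 0 means outside
        self.photo = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.depth == 0 and "vcard" not in classes:
            return          # not inside an hCard and not opening one
        if tag == "img" and "photo" in classes and self.photo is None:
            self.photo = attr.get("src")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def avatar_from_html(html, page_url):
    """Return the absolute URL of the hCard photo, or None if there is none."""
    parser = HCardPhotoParser()
    parser.feed(html)
    return urljoin(page_url, parser.photo) if parser.photo else None
```

The depth counter is a simplification that assumes reasonably well-formed markup; a real service would want a more forgiving parser.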

This concept had already been brainstormed on the microformats wiki.

Use case: Get registration information

A lot of information you need to fill in once during registration such as your full name, date of birth and some other stuff can be gleaned from the hCard. The site getsatisfaction.com already offers to scrape this information from an hCard supporting profile when signing up, saving you the trouble to fill it in.

Flickr lets users list their preferences in music, literature and cinema on their profile page. This listing could be marked up as an hReview with a rating of 1.0 on a worst/best scale of -1.0 to 1.0 (like is 1.0, dislike is -1.0). Then I could reuse it on all the social networking sites that want to know my favourite movies.
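A site that expects ratings on another scale can rescale the like/dislike values linearly. The sketch below maps the -1.0 to 1.0 scale onto 1.0 to 5.0, on the assumption that the consumer uses the five-point range that hReview, as far as I recall, falls back to when no worst and best are given.

```python
def to_five_point(rating, worst=-1.0, best=1.0):
    """Linearly rescale a rating from [worst, best] onto 1.0 .. 5.0."""
    if best <= worst:
        raise ValueError("best must be greater than worst")
    if not worst <= rating <= best:
        raise ValueError("rating lies outside the declared scale")
    # Position of the rating within its own scale, from 0.0 to 1.0.
    fraction = (rating - worst) / (best - worst)
    return round(1.0 + 4.0 * fraction, 1)


print(to_five_point(1.0))    # a "like"
print(to_five_point(-1.0))   # a "dislike"
```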

Use case: Find out who I trust

The stuff Dare and Reinier talked about with building trust networks and using that information can be realized by walking the XFN web.
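Walking the XFN web is essentially a breadth-first traversal of profile pages. A minimal sketch, assuming the links have already been crawled into a dictionary of who links to whom; the function name and the two-hop default are illustrative choices only.

```python
from collections import deque


def people_within(graph, start, max_hops=2):
    """Breadth-first walk over the XFN graph.

    graph maps a profile to the profiles it links to. Returns a dict
    {profile: hops} for everyone reachable within max_hops, excluding start.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue        # do not expand beyond the hop limit
        for contact in graph.get(person, ()):
            if contact not in hops:
                hops[contact] = hops[person] + 1
                queue.append(contact)
    del hops[start]
    return hops
```

A site could then weigh whatever it shows you by hop count, treating direct contacts as more relevant than friends of friends.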

Any site imaginable can be improved by adding the knowledge of my network. Imagine IMDb which shows you which movies your friends have recently watched. Or anything really, and all this without having to add your friends on every such network.

Wouldn’t that be a dream?

Monday, June 18th, 2007

Reboot - Willem Velthoven on OpenCI

(This post got stuck somewhere in the queue. Tomorrow my longer post about Portable Social Networks.)

Willem Velthoven was scheduled to speak about anyMeta on the second day. I was already familiar with Mediamatic and their work and Willem had already contacted me about their social networking offering but his talk provided a lot of insight anyway.

Willem on the left (picture by Julian Bleeckr)

I did not know that Mediamatic is significantly in the social networking business. They seem to implement a great number of them on top of their standard anyMeta platform. Having done this a number of times, they began to wonder if they could abstract away the commonalities to reduce the duplication of effort. Willem talked about how his personal information is duplicated on a great number of websites and how this gets tiring.

anyMeta is also the system used for the Reboot.dk website before, during and after the event itself. It is a structured wiki which takes some getting used to but which I think is quite rich in functionality. The only thing I am missing right now is a fine-grained setting to receive notifications from the system.

Willem also talked about the API which any anyMeta site exposes at a standardized URL and which provides hooks to do pretty much anything you would want to with the site.

OpenID support both ways (by which I think he means both accepting OpenID logon and being an OpenID provider) is supposed to be forthcoming.

In his talk Willem outlined the use cases he envisioned a network of social networks should accommodate and the problems that would come up with that. He was also very curious whether other people had already started doing the same so that no effort would be duplicated.

I had registered myself on the Reboot site to host a conversation about the technical aspects of implementing social networks using OpenID. I mentioned that this could be a great follow-up to Willem’s talk: first talk about the need and the use cases for an open social networking system and then talk about the technical means we already have at our disposal to realise such a system.

See the following post on my talk.

Sunday, June 17th, 2007

All Transactions are based on Trust - Part 3

The series finale. (previously: intro, part 1 and part 2)

Part 3: Future of Trust: Ponderings on the future of the social web

In parts 1 and 2 we’ve created a pretty sweet hypothetical article recommendation engine based on networks of trust relations.

That was merely an example; almost everything you do on the web involves trust. Consider the following current internet practices that really need some sort of trust web to solve a bunch of defects:

  • eBay needs more trust. Not just the general “Can I trust this guy to actually deliver what he’s selling?”, but even simpler, what if you could reduce a buyer/seller’s feedback score to only the feedback given by your trust network?
  • Receiving e-mail: What if your spam filter could take into account the trust relation between you and the email sender? A respectable company would be able to get their form emails easily past your spam filter, and any companies that do engage in spam will see real repercussion and cost: Massive loss of trust, undermining any future endeavours. If the trust vectors are interconnected, this loss of trust hurts them on the entire web, not just on email.
  • Blog commenting: Almost analogous to receiving email, with no more need for Akismet. Knowing the trustworthiness of a server operator also helps directly in cutting down on linkjacking and shill blogging (Trust #3 in the reddit/digg/delicious analysis, “Can I trust the host of the linked article to be honest?”, is satisfied with such a system!)

You can come up with similarly elegant fixes to just about everything you do on the web.
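The eBay idea from the list above can be sketched in a few lines, assuming feedback is available as (reviewer, value) pairs and your trust network as a set of reviewer identifiers. Both the data shapes and the function name are hypothetical.

```python
def trusted_feedback_score(feedback, trusted):
    """Score a seller using only feedback left by people you trust.

    feedback: iterable of (reviewer, value) pairs, value being +1 or -1.
    trusted:  set of reviewer identifiers in your own trust network.
    Returns (score, number_of_reviews_counted).
    """
    relevant = [value for reviewer, value in feedback if reviewer in trusted]
    return sum(relevant), len(relevant)


feedback = [("anna", 1), ("stranger1", 1), ("stranger2", 1), ("bob", -1)]
print(trusted_feedback_score(feedback, {"anna", "bob"}))
```

Returning the count alongside the score matters: a score of zero from two trusted reviews means something different from zero trusted reviews.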

Trust is universal

The world is your oyster

The one problem with setting all web services up to work with webs of trust is that it’s annoying to upload your list of friends to all these webservers. Optimally you really want a single site/space/page where you can drop your list of people you trust, and let all other services —your email provider, digg, reddit, del.icio.us, eBay, your blog software, your web browser, your flickr account, etc.— simply read out your trust web from there.

There are 2 separate movements underway to help out in this regard.

Facebook and Open web APIs
A number of disjointed web platforms already are aware of (some of) your web of trust. For example, your average ‘social network’ server (facebook, MySpace, Hyves, etc) knows about your friends, your friends’ friends, etcetera. In theory at least you trust your friends at least somewhat. Facebook has made the bold first move of making it relatively easy to ‘surf’ this network of trust, which should make it possible for other sites to simply glean your trust network from there instead of re-inventing the wheel.
Hopefully other services which have a part of the network of trust relations will open up their services as well. For example, mutual email conversations (you both sending and receiving) are a pretty good indicator of trust. Some crafty database queries on the gmail server could produce a very useful web of trust graph. The blogs you have in your RSS feed are also a (usually) positive reflection of the amount of trust you have in a given user, in this case the author/operator of the sites behind those feeds.

Opportunity

Interesting startup idea here - or probably more likely a lucrative opportunity for an existing social network service, like facebook: You leave your username and password details of a number of web services, to get a heuristic attempt at recreating your complete trust web. Because the indicators I named above sometimes might be wrong, this site should also offer a simple way to give someone an explicit positive or negative review.

The biggest challenge here is simply realizing that two accounts on two different web services belong to the same person. Not all web services use email, and most people have more than one e-mail address.

OpenID
The OpenID movement is taking a more distributed approach to the problem. We at Four Starters have written lots about OpenID, but the basic gist is simply the ability to store all the information you usually need to fill in to register at sites (username, password, email, home address, website, thumbnail photo, etcetera) on one server, so that you can then allow other websites to simply ask that server for the information. The ‘OpenID’ server, upon getting a request for any sort of information (including just authenticating that you are you), then asks you to identify yourself. The upshot is that only your OpenID provider even needs to know a password. For all other sites you simply enter your OpenID, which is a URL. Mine is http://reinier.zwitserloot.com/ for example.

OpenID Logo

The amount of data you can put on the OpenID server is extensible; it doesn’t have to be limited to just the usual name, email, address information. You could stuff your trusted contacts in your OpenID database as well —a list of OpenIDs combined with a trust percentage. This system solves the problem of linking identities that the aggregate existing services plan listed above suffers from.

This is really the solution Dick Hardt seems to be talking about in his world famous Identity 2.0 presentation. I had the good fortune to see an extended and updated version of it live at The Next Web 2007 where he presented.

I’d love to delve deep into what needs to happen to the web to make this solution work, but I couldn’t possibly do as good a job at it as Dick’s presentation, so I will simply suggest you watch it, if you haven’t already.

Maintaining the trust web

As Cristiano wrote yesterday, and as Deborah Schultz talks about in her presentation, there are gradations of friends. There’s a parallel here to trust: There’s also a gradation of trust. Some people I trust almost completely, others I only trust a little bit. Just like friends, these levels are also dynamic —sometimes trust (and friendship) waters down over time, sometimes you make new friends, or learn to trust new people and sometimes someone does something to lose your trust.

Because it’s important to keep your web of trust updated, the idea of letting each site run its own little web of trust doesn’t scale very well. Centralizing your web of trust into a single repository is crucial to making this vision of the web a reality. It also means that this trust relation thing really needs to be a read/write proposal: It must be possible for me, optimally speaking, to very very quickly downgrade or upgrade an individual’s trust percentage in reaction to for example getting screwed on/satisfactorily completing a transaction with someone on eBay. There’s no good reason why OpenID (or the facebook API) can’t be extended in such a way as to make this possible.
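What a read/write trust repository might look like can be sketched as a tiny in-memory store. This is purely hypothetical: neither OpenID nor the Facebook API offers such an interface, and the class name, method names and the example URL are invented for illustration.

```python
class TrustStore:
    """Trust percentages keyed by OpenID URL, always kept within 0..100."""

    def __init__(self):
        self._trust = {}

    def get(self, openid, default=0):
        """Read the current trust percentage for an identity."""
        return self._trust.get(openid, default)

    def adjust(self, openid, delta):
        """Raise or lower trust by delta points and return the new value."""
        new_value = max(0, min(100, self.get(openid) + delta))
        self._trust[openid] = new_value
        return new_value


store = TrustStore()
store.adjust("http://seller.example/", 60)    # a transaction went well
store.adjust("http://seller.example/", -25)   # the next one did not
print(store.get("http://seller.example/"))
```

The point of the single store is that every service reads and writes the same numbers, so a downgrade after a bad transaction takes effect everywhere at once.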

It can be argued that trust is dependent on the type of action: for example, I trust my baker to make me a nice pie much more than some of my friends who can’t cook for beans. I doubt this distinction is needed, though. After all, I DO trust my friends not to try and saddle me with a nasty-tasting pie I don’t actually want.

Security: Hurdles ahead

Unfortunately it is now time to delve into the issues that will have to be solved before this is going to work.

Primarily, there is identity theft. It’s already a big problem now, but with trust webs, getting your identity jacked is even more of an issue. Lots of spam is already sent from compromised computers. It’s a small leap to go from there to also jacking that user’s OpenID login, so that the spam software can add itself as a trusted resource, or, alternatively, to just identify itself as you. Either way, everyone who trusts the user with a compromised computer now also trusts the spammer. It doesn’t even have to involve keylogging. The world doesn’t change in a day; in practice we’ll be stuck with old services using user/pass based login for decades. Random users are very likely to use the same password there as they do for their OpenID provider. In effect we create a single point of failure by centralizing identity in this way.

One solution is to not use a password to identify for OpenID. Instead, use a ‘shape password’ (the act of drawing a little image), or a ‘visual password’ (the act of picking an image or a series of images out of a large set of them). By aggregating all the user/pass stuff into a single page, it is possible to be a little more thorough and intelligent about the way this site verifies your identity. Another option is to use hardware, like a USB key, to serve as authentication device.

Still, none of these solutions are completely impervious to security leaks of some sort. As Bruce Schneier explains, in general security products tend to suck. Designing for failure is going to be necessary.

I don’t really have the answer here, unfortunately. Brighter minds will have to crack this nut.

Going the distance

A couple of web-based services would be made possible with such a centralized web of trust that currently aren’t really feasible. Just to really dig deep into the possibilities, imagine a political system based on this web of trust. Instead of electing a representative based solely on ideas, you elect on trust - basically on the idea that a given individual will be honest and act with integrity in representing you. If a system exists to anonymously inform your representative about your preferences, the representative will then have to filter and interpret the spirit of his constituents’ opinions. Attempts to pander to company lobbyists, or to go too far against the opinions of those who voted such an individual into power, should lead to a loss of trust, which will prevent re-election, or, preferably, get him ‘fired’ from his job as representative the moment his trust level drops too far.

Vote but better

Friday, June 15th, 2007

Why Friendships are Money

Just like most of you, I have been following Reinier’s posts on trust with a lot of interest. His observation that “a single Reddit vote has been reduced to 0% trustworthiness” in particular has really inspired me to think about online communities and the relationships that they supposedly forge. It got me thinking of the problem that Deborah Schultz presented about her having 3000 relations (not friends), and the fact that the average (!) person on MySpace has 30 ‘friends’. I just can’t believe that all these people are friends!

I don’t have 3000 friends, mainly because I simply don’t have the time to maintain that many friendships. Maintaining a friendship in real life takes time, and because my time is limited it is fairly impossible to maintain an unlimited number of friends. Even more, a real friendship has a certain minimum threshold of time that is required to maintain the relationship, meaning that maintaining even 300 friends is something I consider impossible. So why are there so many people in online networks claiming to have 3000 friends?! There is a clear disconnect here between the meaning of on-line and off-line friendship.

The problem with current online social networks is that it’s a bit too easy to make people your friends. This leads to the ability to make “friendships” with a theoretically unlimited number of people, and the only way to maintain this is to lower the value of a friendship. Much like with money and products, the more there is of it the lower the value will become. The online social networks, in their attempt to please their users, have therefore decreased the value of friendship to a virtual zero.

So let’s take these networks back to the drawing board and have a look at what makes offline friendships valuable. As I stated before it takes time to manage and maintain friendships, and therefore you don’t have that many real friends. Even better: as you move through life and move house, change school, or switch jobs you will be losing certain friends and gaining others. Losing contact with “old friends” can be a painful thing as it is a lost investment in the time you spent to maintain the relationship.

The point of this story is that creating and maintaining a friendship costs time, and as time is money to most of us we can conclude that friendship is a really valuable investment: in other words friendships are money! And since online communities with their boolean relationships (== every relationship is as valuable as the other) offer you unlimited friendships we have now come to an online society where these friendships don’t offer any value.

The task for current and future social websites is to start creating networks that are about the quality of your relations and not the quantity. It might be reasonable to say that having online friends in a network should take time. In fact this is actually already happening as I for example have quite a few “friends” in my Facebook that I never talk to and don’t consider friends, but maybe we can take this one step further and have an online friendship dissolve to “zero” in time when the interactivity between the “friends” weakens.
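One way to picture a friendship that dissolves to zero is a simple linear fade since the last interaction. The sketch below is only an illustration: the one-year default and the function name are assumptions, and a real network might well prefer a different curve.

```python
def friendship_strength(days_since_contact, fade_days=365):
    """Linear fade from 1.0 (contact today) to 0.0 after fade_days.

    Any interaction resets days_since_contact to zero, so the strength
    only drops while the "friends" are not talking to each other.
    """
    if days_since_contact < 0:
        raise ValueError("days_since_contact cannot be negative")
    return max(0.0, 1.0 - days_since_contact / fade_days)


for days in (0, 90, 365, 700):
    print(days, friendship_strength(days))
```

Unlike a boolean friend flag, this gives every relation a value between 0 and 1 that has to be earned back by actually interacting.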

If a network would implement this loss, it would create a real incentive to invest time in your friendships. Again, as time is money you are actually investing money in your friendship, and maybe it is an idea to offer some people the option to pay for their friendships when they don’t have enough time (although I think we are going in a weird direction here if we allow people to “buy their friends”, as people are already ‘buying’ friendship now by posing half nude or making an idiot out of themselves just to rack up MySpace friends).

We can conclude that having online friendships should be a valuable investment. Obviously this investment can be good for yourself as you create a network of trust and therefore increase the value of the group. It is clear that ‘declaring’ someone as a friend just doesn’t count as being a true friendship, and it actually devalues the concept of friendship. I am looking forward to the next generation of online social networks that will be focusing on trust, friendship and non-boolean relationships. There sure is a lot of work to be done, but I am an optimist so I am sure some bright mind (Reinier?, Alper?) will already be thinking about some possible implementations.

Thursday, June 14th, 2007

All Transactions are based on Trust - part 2

The series continues. (previously: intro and part 1)

Part 2: Analysing a trust-aware internet transaction: del.icio.us network

In Part 1, we analysed a typical transaction on reddit, an article aggregator with the principal function of recommending you interesting articles to read when you are bored or just in need of some news.

Today, we look at a service with a very similar premise - del.icio.us network. While the premise is exactly the same (give you a list of articles which might be interesting), and while the basic notion is similar (a disconnected set of people basically ‘vote’ on stories), the actual implementation is completely different. Specifically, the way trust is interwoven into the network feature is entirely different from reddit’s system.

Where reddit seems to actively try to eliminate trust as a factor (it is for example impossible to see who votes for what, only comments and submissions can be found, though not easily) - del.icio.us network works solely on trust relationships.

Let’s revisit the same transaction of part 1, but this time with del.icio.us network.

del.icio.us recommending me something to read

I go to my del.icio.us/network page. I will need to trust the operators of del.icio.us, which can be problematic, as del.icio.us is owned by Yahoo, a business. Businesses, in theory, have no morals. Fortunately in practice I can take off my paranoia hat and trust healthy competition - google does not point me to any convincing evidence that Yahoo is trying to surreptitiously hawk political views or allow unmarked advertising. I’ll trust this site - enough, at least, to let it recommend articles to me.

The network page is a lot like any reddit page - a bunch of articles, some with very obvious descriptions, some less so. There’s some extra fluff (total number of del.icio.us users who bookmarked a given article, and a tag list). While potentially interesting, from a trust point of view this information is just as useless as reddit’s article score.

The next issue of trust is, for each article that appears here, if I can actually trust that I should give it my due attention. This is where del.icio.us/network differs from reddit and digg: An article is on that page ONLY because one of my direct connections thought it was sufficiently cool to bookmark it. I trust those people I manually add to my delicious network. Thus I can directly trust the articles that show up on my network page. The exact mechanism of trust is left to the user; I may trust one of my network contacts because they are my friend. I may trust someone else because I like his blog and the articles he links to there seem interesting. Regardless of why I trust my network contacts - the point is that I personally trust them.

The final step of trust is - once I decide to read an article, can I trust that the operators of the server that hosts the article are trustworthy? This step is also much more adequately addressed: One of my personally trusted contacts saw fit to go through the trouble of bookmarking it. At least a modicum of due diligence has probably been applied.

From a trust point of view then, del.icio.us/network is on the up and up. There is no problem here - trust-wise, this system will not collapse under the weight of its own popularity. If some schmoe manages to sign up for a del.icio.us account and starts bookmarking spam, tripe, and drivel, I don’t even notice.

London Eye

Basically, my network is a wheel: I’m at the center, with all my connections arranged around me, feeding article recommendations to me.

There’s even a responsibility system built in: If one of the users in my network keeps bookmarking crappy articles, I can remove them. One common problem with responsibility (a.k.a. karma systems - scores for users) is that the trust issue isn’t addressed at all: The karma of any given user is again determined by untrustable, unaccountable masses. Removing someone from recommending articles to you completely is much more effective from a trust point of view.

Trust is necessary… but not sufficient

Unfortunately, though, just because you built a system that maintains trust in the transaction, doesn’t mean your idea is any good.

Some problems with del.icio.us:

  • Traffic - once you run out of articles, there are no more. On reddit and digg, there are always more stories to read because the pool of submitters is much larger.
  • GroupThink - If all the users in your network read the same blogs, work in the same area, and have the same thoughts, your network is very unlikely to bring you new ideas in new topics, or well written arguments for viewpoints you do not hold. In practice large communities suffer just as much (Digg and Reddit have of late sported front pages where every single article is either extolling the virtues of one Ron Paul, presidential candidate for the 2008 elections in the United States of America, or taking the mickey out of George W. Bush).
  • Rating - While on reddit each article has a score and thus you can sort them, on del.icio.us an article is either on your network page, or it isn’t. Once your network produces more articles than you can handle, there is no way to prioritize them usefully.

Fortunately, trust can help us out here, if you apply some more of it to del.icio.us network. None of the steps I’m going to explain here have been implemented by del.icio.us yet. It would make for a much better experience if they would.

A wheel does not a network make!

By acknowledging that a network is more than just a wheel with spokes, these problems can be addressed!

In the ‘wheel’ view of a del.icio.us/network, I can actually check out the networks of friends, check out people THEY have deemed fit to add to their network, check out what those people have been posting, and if I like it, add it to my network. That’s one way of solving a dearth of articles: Just add more people to the network.

So, instead of a wheel, I can treat delicious as a connected network:

Social Network

There’s really no reason why this can’t be done automatically. Anytime I’m out of articles, so to speak, it should be possible to just say: Go to the ‘next layer’ - give me articles recommended by friends of my friends. Trust is more or less multiplicative, after all: If I trust Jack, and Jack trusts Joe (I don’t know Joe), I can trust Joe to some extent. Once 2 layers no longer give me enough articles, I can go to a third layer, ad nauseam.
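The ‘next layer’ idea can be sketched in a few lines. This is only an illustration, not anything del.icio.us offers: the `network` dictionary (user name mapped to the set of people that user has added) and the function name are assumptions of mine.

```python
def contacts_by_layer(network, me, max_layers):
    """Group users by their distance from `me`.

    `network` is a hypothetical dict: user -> set of users in their network.
    Returns a list of sets: layer 1 holds my direct contacts, layer 2 the
    contacts of my contacts that I have not seen yet, and so on.
    """
    seen = {me}
    frontier = {me}
    layers = []
    for _ in range(max_layers):
        next_frontier = set()
        for user in frontier:
            for contact in network.get(user, ()):
                if contact not in seen:
                    seen.add(contact)
                    next_frontier.add(contact)
        if not next_frontier:
            break  # nobody new to reach, the network is exhausted
        layers.append(next_frontier)
        frontier = next_frontier
    return layers


network = {"me": {"jack"}, "jack": {"joe", "me"}, "joe": {"ann"}}
layers = contacts_by_layer(network, "me", 2)
```

With this toy network, layer 1 is Jack and layer 2 is Joe; asking for more layers would bring in Ann as well. Running out of articles then just means reading what the next layer has bookmarked.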

We can solve the other problems in a similar fashion, but a more holistic approach solves them all.

First, we establish a scoring system on a per-article basis, dependent on the network. The network of del.icio.us basically consists of users, connected to each other (each connection represents someone being in the network of someone else). Now add the articles themselves to this network: Anytime I bookmark an article, I am connected to the article directly. Anytime a friend of mine bookmarks it, I’m connected to it through my friend.

It is of course possible that I’m connected to an article in a number of ways. A friend of a friend bookmarked it, a colleague’s brother’s girlfriend’s classmate bookmarked it, and one of the bloggers read by someone whose opinions I admire bookmarked it, for example. In the network this is represented by 3 different ‘paths’ I can take to arrive at the article.

These paths can be distilled into one final personalized score. Each connection multiplies the score by 80% - so a friend’s friend’s friend, 3 steps, is .8 * .8 * .8 = 0.512 in total score. For multiple different paths, you can’t just sum them up (or you could end up with a score above 100%), but there are a number of algorithms. Naively: of all paths, take the highest scoring, divide by 2. Take the next highest scoring, divide by 4. Take the third highest scoring, divide it by 8, ad nauseam, then add them all up. This number can never exceed 100%. Another way of doing this is to consider each link in the network as a resistor in an electric circuit. Multiple resistors placed in series add up their resisting effects and thus reduce current. Multiple resistors placed in parallel lessen the resistance, but, whatever you do, you can never get more power out than you put in. Now replace resistors with links on the network and you have an algorithm!
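Here is a rough sketch of the naive variant, assuming each path is described only by its number of hops and every hop keeps 80% of the score. The function names are mine, purely for illustration.

```python
def path_score(hops, per_hop=0.8):
    """Score of a single path: every hop keeps `per_hop` of the score."""
    return per_hop ** hops


def combined_score(path_hops, per_hop=0.8):
    """Naive combination: best path / 2 + second best / 4 + third best / 8 ...

    Because 1/2 + 1/4 + 1/8 + ... stays below 1 and every path score is at
    most 1, the result can never exceed 100%.
    """
    scores = sorted((path_score(h, per_hop) for h in path_hops), reverse=True)
    return sum(score / 2 ** rank for rank, score in enumerate(scores, start=1))


# An article bookmarked by three of my direct contacts (three 1-hop paths):
popular = combined_score([1, 1, 1])   # 0.8 * (1/2 + 1/4 + 1/8), roughly 0.7
# An article reachable only through one friend's friend's friend:
distant = combined_score([3])         # 0.512 / 2, roughly 0.256
```

One quirk of the naive scheme is that a single path is always halved, so even one direct friend's bookmark tops out at 0.4. That is fine for ranking articles against each other, which is all that matters here.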

This scoring/recommendation algorithm can even be extended to del.icio.us users: The score of a user is then entirely dependent on how well he’s connected to your own network (though relying too much on this can lead to GroupThink!).

Such a system solves all 3 problems. To wit:

Traffic

Research in social networks finds that usually social networks are virtually completely connected. There’s a path from any one person to any other. Thus, it’s possible to derive a score for every article and you can just keep reading indefinitely, though, of course, as you keep reading, each further article has a lower score.

There’s some excellent research by GustavoG on the social network of Flickr (also a web app that allows you to set up a network of friends). Very pretty pictures of tightly interwoven networks, such as this one:


Flickr’s demographics in January 2005. Click on the image for the full story and more graph images.

Rating

As already explained, any given article is no longer a simple yes/no proposal: Articles recommended by a number of your direct friends rate highly. Articles only recommended by one distant link (A friend’s friend’s friend, and that’s it) rate lowly.

GroupThink

This is where it gets very interesting. Because everyone builds their own unique community, GroupThink is no longer a virtual guarantee. For example, on digg or reddit, if a well written article happens to put a ‘taboo’ topic in a good light (like Java, Microsoft, George Bush, traditional media, and a few others), or a ‘holy’ topic in a bad light (Ruby on Rails, web2.0, digg/reddit itself, Apple, Linux, and a few others), chances are very high it gets drowned out in the noise of the crowd. Even if all the people whose judgement I actually trust did vote it up, I never see it. Contrast this with your own unique community, where articles at least have a chance.

There are two forms of GroupThink: Accidental and intentional. On both Digg and Reddit, you occasionally see a post imploring to put an end to the flood of the latest meme-of-the-day posts. Ironically these also get voted up with some frequency. Clearly then not all GroupThink is actually desired by those experiencing it. In a social network this GroupThink is eliminated; you can simply hunt down which elements in your network are fielding the majority of an onslaught of a certain meme, and toss them from your network or at least lower your level of trust in them.

The other type is intentional: Where a reader actually wants to read more about the same topic over and over again. There’s not all that much to be done; trying to force other reading material onto such a person is tantamount to censorship and very hard to distinguish from forced propaganda.

In practice, in real life, GroupThink is somewhat rare, because you have friends from many places. Colleagues, family, old school buddies - friends of people you’ve dated that you kept in contact with, etcetera. If these real life bonds also exist in your del.icio.us network, ostensibly the chance of GroupThink is much reduced.

I could be wrong, but a system like that sounds like the ultimate source of articles. As much or as little as you want to read, resilient to GroupThink, nearly impossible to spam, and ever evolving to your tastes. Unfortunately, as far as I know, nothing quite like it exists just yet.

… or does it? The remarkable quality of the early phase

A version of this ultimate article recommendation engine did exist, briefly.

reddit itself meets this system! At least, it did, in the first few months after the launch. The users of reddit back then amounted to a single connected social network. A number of important features weren’t there (all votes were equal instead of being attenuated by the distance in ‘friend links’ from you, for example), but on the whole this was it. If you happened to use reddit in those days, or you know someone who did (I fortunately managed to catch the tail end of those days), you may remember or hear about the amazing quality of articles.

This idea actually can be observed in many budding social networks. For a little while, Orkut (google’s ‘myspace’) was a trove of excellent networking opportunities. This was back when Orkut required very scarce invites.

Invites are an excellent way to keep a social network in the efficient phase as long as you can, but of course they do restrict growth - by definition, that’s how they keep the efficiency of the social network high. In fact, a number of more or less ‘secret’ smaller social networks that work on invites and a strong sense of responsibility (a misbehaving user gets kicked, and the one who invited the abuser also gets kicked!) have been running strong for years. The one problem with that tactic is that it can’t scale.

A trust network can!

The final part 3 will be posted the day after tomorrow (Friday evening). In it: expanding this idea to other walks of the web and of life in general, the importance of identity in such a trust-bound world, and how Identity 2.0 and open APIs are the beginning of a brave new world. As an encore, part 3 will also briefly discuss a problem I’ve so far omitted: Doing all these scoring calculations is, computationally speaking, extremely difficult. Spoiler: there’s a way out of it, more or less!

To continue reading, go to part 3.

Tuesday, June 12th, 2007

All Transactions are based on Trust - part 1

As promised, today part one of a series on trust.

Part 1: Analysing a typical web transaction: The Reddit Breakdown

Flashback to a year ago. Reddit is relatively new and has a limited but very active userbase.

I go to reddit.com. I will need to trust the site which implies I need to trust its operators, as it is impossible to trust a computer (they do what they are told, without questioning orders, hence it’s folly to trust a machine implicitly). This will be referred to as Trust #1.

Trust #2. I see an article on the front page, with 100 votes. I need to trust those who have voted that this actually means it’s a good article; I basically need to trust that this score number has any meaning.

Trust #3. I follow the link and read the story. I trust that the story doesn’t lie and that any further action I take, like bookmarking it, or recommending it to others, won’t get me any surprises (if I’m going to share it with others, I’ll need to trust that the site author didn’t linkjack the content, for example).

In the early days of reddit, all 3 forms of trust are more or less met to my satisfaction. Here’s a break down:

For #1: The mere fact that Paul Graham recommends these guys is good enough; I trust Paul Graham. Not very much, I don’t know him personally, but he has a lot of reputation to lose if he recommends a bunch of swindlers. Thus I trust Paul Graham’s judgement enough for me to be satisfied here. Note here that it’s possible to trust someone purely on what they have to lose.

This is an example of trust-by-chaining (I trust the operators of reddit because Paul Graham trusts them, and I trust Paul Graham) and trust-by-buy-in (By recommending them, Paul Graham has effectively placed money (the value of his reputation) on the table which he will lose if reddit is swindling my time by e.g. making crappy articles look highly rated for cash. Paul Graham has a certain level of buy-in to this recommendation). He could be wrong - but trust doesn’t need to be perfect.

For #2: This is where it gets interesting. I trust the votes (back then) because reddit was only known to those ‘in the know’ - the first redditors were personal friends of Graham and the authors of reddit, and had personal buy-in not to screw it up for their friends. The userbase then exploded outwards like a viral infection but elitism kept the quality high for quite a while. Posting lolcats all day and ‘abusing’ the site by downvoting well written, insightful articles that don’t happen to coincide exactly with the voter’s viewpoints, for example, didn’t happen, because the vast majority of the redditors, by mere virtue of being so in the loop that they knew about reddit in the first place, don’t do that sort of thing. There’s also the issue of intent: There’s very little to gain by gaming the site. Unfortunately, trolls and social rejects exist, but as a rule there are far fewer negative influences if there’s nothing to be had. Back then the user base was too small and too new and thus flew under the spammers’ radar.

This is an example of trust-by-chaining, but without me actually seeing the chains: I trust the authors to only have friends they recommend reddit to who are known by them to have a modicum of netiquette. This type of trust isn’t very ‘strong’, but I don’t need much just to accept a recommendation to read an interesting article. Note that trust is multiplicative: If I trust Jack for 80%, and Jack trusts Joe for 80%, I can trust complete stranger Joe for 64%.
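The multiplication is trivial, but spelled out as code it looks like this. The function name is just an illustration, and I assume every link in the chain carries a trust level between 0 and 1.

```python
from math import prod


def chained_trust(links):
    """Trust in the person at the end of a chain: the product of every link.

    `links` is a list of trust levels between 0 and 1, one per step.
    """
    return prod(links)


# I trust Jack for 80%, Jack trusts Joe for 80%:
joe = chained_trust([0.8, 0.8])   # roughly 0.64
```

Every extra link can only lower the result, which is why a long chain of strangers is worth so little.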

You may realize at this point that the ‘trust from elitism’ argument no longer applies to reddit, nor to digg - they are too famous now. It also explains why almost all ‘open’ social systems, where every user’s vote has an effect, start off stellarly well and always drop off. From kuro5hin, to digg, to reddit.

For #3: This is a bigger problem. The only practical ‘proof’ you have for the majority of links (specifically: Every link to an article on a site that you aren’t familiar with) being trustworthy is the personal recommendation of the original submitter. I don’t trust the voters enough to expect them to have done due diligence on the trustworthiness of the operator of the linked article. For the same reasons as #2, in general this trust was at least satisfied to some extent in the early days.

Fast forward a year.

Reddit is now so famous, the trustworthiness percentage of any one vote has dropped to absolute 0. Not 0.00001, there’s nothing left. The value of a normal redditor’s vote is extremely low, and some of the redditors are actively abusing the system, voting their own blogs up just for the traffic, posting their own linkjacked material, voting other stories down just so that their own has a better shot, etcetera. These cancel with the very low value of the vote of an unknown internet user, resulting in a value of absolute 0. The value of a vote is also not negative (which would mean I could just read the most negatively rated articles!) because the same scammers would force the equilibrium back to 0 if reading the most lowly rated articles ever became a useful way to use reddit.

lottery

A trust level of 0 is the only trust level which is utterly worthless. The information available to me for any given recommendation from reddit is: The ‘score’ (upvotes - downvotes) + the username of the one that posted the article. Knowing that 150 more reddit users thought article X was good enough to vote it up versus down, coupled with the fact that user ‘foobar’ thought it was good enough to post to reddit, assuming I don’t know ‘foobar’, has no practical value. I might as well pick a completely random web link to read - it’s a lottery.

There is no such thing as a ‘wisdom of the crowds’ unless you meet the stringent requirements for this: No attempts to screw up the system (or those attempts are symmetric and thus cancel out), and no practical way for mob mentality to form - no way for each individual of the crowd to be influenced by the rest of the crowd. Reddit and digg definitely do not meet the qualifications and thus there is no trust to be found by knowing “a bunch of” random people’s opinions happened to coalesce. Hence: The recommendation has no value. It is worthless. Practically, this will probably manifest itself as sucky submissions, and this is in fact exactly what’s going on. The function has changed - it’s a rolling window on the current meme of the web, no longer a site that recommends interesting articles.

There are ways out of this dilemma. Specifically: If I did trust the user ‘foobar’ directly, for example because I remember that his submissions have been excellent so far, I’m satisfied for Trust #2 and Trust #3 and I can go read the article (A form of trust by past performance - inductive reasoning is behind the assumption that he will continue to do so. It’s certainly not worth 100% trustworthiness, but it’s enough for article recommendations). Unfortunately that completely goes against the idea of popularity aggregators and as expected both digg and reddit make it very hard for you to work in this manner. There’s no easy way to mark someone as a ‘friend’, for example.

A site which actually works almost entirely on that principle (articles recommended by people you already trust) is del.icio.us, the online bookmark service. You can add people as ‘friends’ and watch the stuff they bookmark in your inbox. While each link does list the # of random users who also bookmarked that link, no amount of ‘votes’ (in the sense that bookmarking is a vote) will make a story appear in my del.icio.us inbox until someone in my personal circle of friends (obviously, people I trust) personally bookmarks it, thus allowing transfer of trust: I trust my friend, he apparently trusts the link he just bookmarked, and thus I trust the link.

Lots of startup ideas I hear about base themselves on the notion that the opinion of random unknown people has an intrinsic value. The problem is that, in the early stages of a startup, it does, because the people from whom you cull the opinion aren’t actually unknowns: They are tied to you by your viral marketing scheme. Your startup is doomed to fail unless you can manage to toss some form of trust in there, for example by allowing users to reduce the site experience to just those people they personally trust, or by explicitly staying small.

To continue reading, go to part 2.

Sunday, June 10th, 2007

All Transactions are based on trust. The web is no exception.

The rapid degeneration of the quality of the posts on digg, reddit, and other aggregators highlights a serious misunderstanding prevalent amongst lots of web2 startups.

All Transactions are based on trust. Even trivial transactions, like someone (or something) recommending a site for you to read (reddit, digg, delicious’s inbox, your flickr friends stream, your RSS feeds, your homepage, google search results). Misunderstanding this leads to websites that get killed by their own success.

I’ll explain how Trust works on the web and how you can build webservices that don’t get killed by their own success by keeping trust a central part of the system in a series of posts.

Post 1: I’ll first give an extensive use case to show how trust permeates all transactions, highlighting why any social network with no notion of an actual ‘network of trust’ behind it cannot become famous without swiftly plummeting to abysmal quality as well - I’ll analyse a ‘transaction’ of me going to reddit (it’s a lot like digg if you’re more familiar with that), and break it down into atomic little bits of trust.

Post 2: A similar service, yet from a trust point of view very different: del.icio.us’s inbox system. I’ll analyse it in the same vein, and then propose a way to expand it to scale in traffic and ease of use without compromising its implicit trust system.

Post 3: A missive on the significance of the Facebook API, how OpenID and the concept of ‘identity 2.0’ can also help us out, and one view of the future of the web and society in general.

This series of posts has been inspired in part by Deborah Schultz’s presentation at The Next Web 2007. It’s 31 slides with lots of pictures to look at. Won’t take you more than 5 minutes to click through, and it sets this series up quite well.

I’ll post the first part of the series tomorrow, so in the mean time, here’s her presentation:

To continue reading, go to part 1.

Wednesday, May 16th, 2007

Blogwalk Eleven Amsterdam - Digital Bohemians

This Friday we will be having a blogwalk in Amsterdam (the eleventh such edition) and I am happy to be participating.
