Links, We Got Links

I used to post compilations of links to things I found to be interesting reading. I dropped the practice some time back and started bookmarking to del.icio.us and then later, Instapaper. The thing I don’t like about this approach is that the links end up kind of hidden in my sidebar widget. Comparing click-through rates for the bookmarks widget against the old way of just writing a post with the links, the links posts clearly had higher engagement.

Secondly, and perhaps more importantly, my Instapaper bookmarks are not exposed to site search, meaning I often can’t find things that I know I have bookmarked without going to Instapaper, while I have always used this blog as the place to compile notes, links, whatever. Long story short, I’m going back to writing regular link posts.


If the Shoe Fits…

When I saw this on Drudge I thought the picture was related to the story, when in fact it was attached to a story about North Korea. There is something Orwellian about any mobile service that enables detailed information sharing with advertisers. If a carrier wants to offer a free version of its service that is linked to advertisers and behavior tracking, great, but the fact remains that I pay a monthly fee for my mobile phone service and I should not be subjected to intrusive behavior tracking.

The story is very clear that the tracking is related to web browsing and installed apps, so in many ways it is like what happens with cookies on the desktop web, and there isn’t much controversy about cookies anymore. The critical difference is that cookies don’t also report back what files I open or who I email, or at least it would be considered a gross violation of accepted practices to do so. What handset manufacturers and carriers are enabling is precisely such a violation by allowing apps to report to advertisers, for example, GPS coordinates.

This has the potential to be very troubling, but as with many privacy issues I will reserve final judgement until I see actual implementation details.

200903111041.jpg

TechFuga and News Aggregation

TechFuga launched a new version today. Among the many improvements in the service are better clustering, better search, and an interesting feature called “upcoming news” that attempts to surface news that is not yet popular but is showing signs of gaining popularity.

200903110938.jpg

I like these services because they efficiently surface interesting news that is domain specific (e.g. tech, politics, sports). In many ways they represent a future for media as well, because aggregation is demonstrating itself to be a bigger lever for publishers than organic traffic growth. You could almost say that aggregation is a perfect complement to search: whereas search surfaces results based on specific keywords, aggregation surfaces results based on domain.

There is a real debate about the legitimacy of purely machine-driven aggregation. Gabe Rivera, whom I consider an authority on this subject, explained this in great detail back in December when he announced that Techmeme would be augmenting its service with human editors.

The problem is context and relevance. Current mainstream aggregators determine this by extracting keywords from unstructured text, building giant dictionaries that help derive context, and then examining link patterns. Semantic search technologies go a step further by building triples that attempt to bring greater context to unstructured text entities.
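To make those three ingredients a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general ideas, not how any particular aggregator works: the stopword list, the function names, and the frequency-based keyword picking are all my own assumptions, and the triples are just hand-written examples.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); real systems use far larger ones.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for", "with", "that"}

def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

def inbound_link_counts(links):
    """Count how many distinct sources link to each target.

    links is an iterable of (source_url, target_url) pairs.
    """
    sources_by_target = {}
    for source, target in links:
        sources_by_target.setdefault(target, set()).add(source)
    return {target: len(sources) for target, sources in sources_by_target.items()}

# A triple is a (subject, predicate, object) statement about an entity.
# These two are hand-written examples, not the output of any extraction step.
triples = [
    ("Techmeme", "is_a", "news aggregator"),
    ("Freebase", "is_a", "database"),
]
```

The keyword step is deliberately naive; the hard part in practice is the entity extraction and dictionary building, which this sketch skips entirely.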

The problem with current relevance technologies is threefold. First, entity extraction, done well, is hard, and small errors magnify into gaping mismatches from the user perspective. Second, there is the stale news problem, where old news is resubmitted based on perceived relevance. Third, link patterns are prone to surfacing a lot of stories that are essentially identical because they all trace back to the same source. Link analysis also doesn’t work very well in domains where there isn’t a lot of linking, like among food publications, which explains why aggregation sites tend to focus on technology, sports, celebrity news, and politics/current events.
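One common way to tame the identical-stories problem is to group near-duplicate text before ranking. The sketch below uses word shingles and Jaccard similarity, which is a generic technique and not something I know any of these services to use; the function names and the 0.5 threshold are assumptions picked for illustration.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_near_duplicates(stories, threshold=0.5):
    """Greedily group stories whose shingles overlap at or above threshold.

    Each story is compared only against the first story of each existing
    group, so results depend on input order. Good enough for a sketch.
    """
    groups = []  # list of (representative_shingles, member_stories)
    for story in stories:
        story_shingles = shingles(story)
        for representative, members in groups:
            if jaccard(story_shingles, representative) >= threshold:
                members.append(story)
                break
        else:
            groups.append((story_shingles, [story]))
    return [members for _, members in groups]
```

Feeding it two rewrites of the same wire story plus one unrelated headline should give two groups, so the aggregator can show the cluster once instead of repeating it.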

Semantic technologies offer an appealing future, but make no mistake about it, these technologies are demanding from a development standpoint. Freebase attempts to circumvent the semantic challenges by relying on existing data sets that have already been organized (e.g. Wikipedia) and on a community-based approach that allows for massive data organization according to domains without relying solely on a machine approach to the semantic problem. But Freebase is at its core a database that other services can use to build context; it is not itself a news or content aggregation site.

Despite all the challenges with news aggregation, the fact remains that this is a very logical approach for publishers and users alike. By clustering related content together and presenting it with an appealing user experience that moves content up/down based on popularity and relevancy, we all benefit.
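For what it’s worth, the “moves content up/down” part can be sketched as a single scoring function. This is a hypothetical formula of my own choosing, not one taken from any of the sites mentioned: popularity and relevance multiply together, and an age penalty (the `gravity` exponent is an assumed tuning knob) pushes old stories down, which also takes some of the sting out of the stale news problem.

```python
def rank_score(popularity, relevance, age_hours, gravity=1.5):
    """Hypothetical ranking score: higher popularity and relevance raise it,
    and age lowers it. The +2 keeps brand-new stories from dividing by ~0."""
    return (popularity * relevance) / (age_hours + 2) ** gravity

# Illustrative, made-up numbers: an older popular story vs. a fresh one.
stories = [
    {"title": "older popular story", "popularity": 100, "relevance": 0.9, "age_hours": 48},
    {"title": "fresh story", "popularity": 40, "relevance": 0.9, "age_hours": 1},
]
ranked = sorted(
    stories,
    key=lambda s: rank_score(s["popularity"], s["relevance"], s["age_hours"]),
    reverse=True,
)
```

With these numbers the fresh story sorts first even though it has fewer votes, which is the behavior you want from a news page.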