Blog

April 2009 Archives

Since the Scout Labs application began, we’ve maintained a rapid pace of development for new features. These new features make the application more powerful and usable, but only if they are bug-free. And since the nature of computer programming makes bugs inevitable, we have comprehensive tests to help us catch those bugs and correct them.

Once a new feature or a change to an existing feature is completed and deployed to our staging environment, the next step is to write new test scripts to ensure the new functionality behaves as expected. All the older scripts are executed as well, as a regression test to make sure the existing features still work.

In addition to the Java and Ruby on Rails tests, we do end-user testing. For this we use Selenium Remote Control, an open-source web app testing tool that works in multiple browsers and operating systems. It allows us to test multiple versions of browsers in different operating systems, including Internet Explorer 6 and 7 and Firefox 2 and 3, on Windows XP, Windows Vista, Mac OS X Tiger, and Mac OS X Leopard. Selenium also supports Safari; we’ll be adding test environments with that browser soon.

When the Selenium Remote Control (SRC) server is running, it accepts commands from a script to automate browser actions. Anything from clicking a button or link to parsing the HTML on a page can be automated.

selenium_running.jpg

Using Selenium and a driver for Ruby, we have created a comprehensive test suite that simulates an end-user browsing the Scout Labs app (albeit at super speed). This way, we can test the functionality of the whole site, from simple to complex:

  • Simple: when the app is used to send a blog post email, it should be sent immediately with the correct content.
  • Complex: when a user creates a bookmark on a blog post, photo, video, Twitter, or comment entry, that bookmark has the correct tags and the correct create date, can only be deleted by its author, and shows up in the correct places (on the entry detail page, on the search result page of the corresponding search, in the bookmark section, under the activity feed of the homepage, and for all other members of that workspace).

selenium_results.png

If the test suite finds a problem, major or minor, a developer fixes it and the test is rerun. Only when all tests have passed on all browsers in all operating systems is the new version promoted to the public site. This ensures we catch as many bugs as possible so you have a smoother experience in the application.
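As a rough illustration of that promotion rule (a sketch, not the actual Scout Labs release tooling), the check can be thought of as a gate over a browser and operating system matrix. The browser and OS names come from the list earlier in this post; the function names, the result format, and the rule that Internet Explorer only runs on Windows are assumptions made for the example.

```python
from itertools import product

# Names taken from the post; the pairing rule and the result format below
# are assumptions made for this illustration.
BROWSERS = ["IE 6", "IE 7", "Firefox 2", "Firefox 3"]
SYSTEMS = ["Windows XP", "Windows Vista", "Mac OS X Tiger", "Mac OS X Leopard"]


def environments():
    """Yield every (browser, system) pair, skipping Internet Explorer on Mac."""
    for browser, system in product(BROWSERS, SYSTEMS):
        if browser.startswith("IE") and system.startswith("Mac"):
            continue
        yield browser, system


def ready_to_promote(results):
    """Return True only if every environment ran and reported zero failures.

    `results` maps (browser, system) -> number of failed tests.
    An environment with no recorded result blocks promotion, the same
    as one with failures.
    """
    return all(env in results and results[env] == 0 for env in environments())


if __name__ == "__main__":
    results = {env: 0 for env in environments()}
    print(ready_to_promote(results))  # every environment passed

    results[("Firefox 3", "Mac OS X Leopard")] = 2
    print(ready_to_promote(results))  # one environment has failures
```

Treating a missing result as a blocker is a deliberate choice in this sketch: a run that never happened should not count as a pass.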

Of course, no test suite is perfect and Selenium can’t catch all the bugs. If something does break, you will see an error submission form; please fill it out and we will fix it as soon as we can!

Last night we pushed some new features that users were VERY vociferous about wanting:

  • New data in the application: Blog Comments
  • Export of data: Now you can download a list of your bookmarks, or .csv or .html files of your blog results
  • Custom date ranges: You can set date ranges for blog data, so that you can see results from a particular time period

New data in the application: Blog Comments

Now you can see mentions in blog comments, the same as you can for Twitter, photos, videos, blog posts, etc. This is hugely helpful when the main blog post does not contain a reference to your search, for example oDesk or Motorola or Dippin’ Dots, but the comments on the post do.

Export of Data: CSV and HTML

You’ve always been able to download the source data for graphs as a .csv or a .png. This new feature enables you to get a list of links with their content in either .csv or .html format, so you can work with them offline, excerpt from them for reports, or just read them through without having to click on anything (those were the three main use cases customers told us about). After you decide to download and choose your file format, the application emails you a link where you can pick up the data. Typically this takes less than 5 minutes. The limit per export file is 1,000 results, but if you really want to read all 60,000 Coke mentions for the last 6 months, you can export them in batches by combining this feature with custom date ranges.
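To make the batching idea concrete, here is a minimal sketch of how a large result set could be split into date ranges that each stay under the 1,000-result limit. It assumes you already have a per-day mention count (for example, read off a graph); the `plan_exports` function and its input format are hypothetical and not part of the application.

```python
from datetime import date, timedelta

EXPORT_LIMIT = 1000  # per-file limit stated in the post


def plan_exports(daily_counts, limit=EXPORT_LIMIT):
    """Group consecutive days into date ranges holding at most `limit` results.

    `daily_counts` is a list of (day, count) pairs in date order.
    A single day that exceeds the limit on its own still gets its own
    range, because this sketch never splits a day.
    Returns a list of (start_day, end_day, total) tuples.
    """
    ranges = []
    start = end = None
    total = 0
    for day, count in daily_counts:
        # Close the current range if adding this day would exceed the limit.
        if start is not None and total + count > limit:
            ranges.append((start, end, total))
            start, total = None, 0
        if start is None:
            start = day
        end = day
        total += count
    if start is not None:
        ranges.append((start, end, total))
    return ranges


if __name__ == "__main__":
    first = date(2009, 4, 1)
    counts = [(first + timedelta(days=i), 400) for i in range(5)]
    for start, end, total in plan_exports(counts):
        print(start, end, total)
```

Each resulting range would then be entered as a custom date range and exported separately.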

New Feature: Export of Data

Custom Date Ranges: Set any range within the last 6 months

We’ve been showing you data within a 24H, 1W, etc. range based on today’s date. Now you can decide which date you want the range to start or end on. So if you want to see data only from February 15 to March 15, select 1M ending on 3/15/09 and you’ll get blog mentions published within that time frame only. Choose “center” on a specific date to see all the posts around a newsworthy or other buzz-generating event; for instance, if you know that Web 2.0 was from March 31st to April 3rd and want to know what Jeremiah was doing around that time, center the search around April 1st and choose a date range.
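The date arithmetic behind these three options can be sketched as follows. This is only an illustration of the start, end, and center idea, not the application's code: the function names, the `mode` values, and the way a centered range is split around the anchor date are all assumptions.

```python
import calendar
from datetime import date, timedelta


def shift_months(day, months):
    """Move `day` by a whole number of months, clamping to the month's end."""
    index = day.year * 12 + (day.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def date_window(anchor, months=0, days=0, mode="end"):
    """Return (start, end) for a range of `months` plus `days` around `anchor`.

    mode="end":    the range finishes on the anchor date
    mode="start":  the range begins on the anchor date
    mode="center": the anchor sits in the middle of the range
    """
    if mode == "end":
        return shift_months(anchor, -months) - timedelta(days=days), anchor
    if mode == "start":
        return anchor, shift_months(anchor, months) + timedelta(days=days)
    if mode == "center":
        # Measure the full span in whole days, then split it around the anchor.
        span = (shift_months(anchor, months) + timedelta(days=days) - anchor).days
        before = span // 2
        return (anchor - timedelta(days=before),
                anchor + timedelta(days=span - before))
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # "1M ending on 3/15/09" from the example above.
    print(date_window(date(2009, 3, 15), months=1, mode="end"))
    # One week centered on April 1st.
    print(date_window(date(2009, 4, 1), days=7, mode="center"))
```

With these assumptions, the one-month range ending on March 15, 2009 starts on February 15, matching the example in the text, and a one-week range centered on April 1 runs from March 29 to April 5.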

New Feature: Custom Date Ranges

There are a couple of other minor enhancements, like cooler buttons that show state and other nifty JavaScript enhancements, but those are better experienced than described. Coming up we have greatly expanded Twitter data, so that you can get graphable trends going back a couple of months, use full Boolean search on Twitter content, use date ranges, get sentiment, and see frequent words on Twitter content. We are also working on an amazing feature that uses NLP techniques to extract customer quotes from user-generated content. You’ll be able to see what people love, or hate, or recommend or wish for about a product or brand or company or whatever you are searching for. It’s SO COOL, we can’t wait to show it to you! Have fun and stay tuned.

As Margaret described in her earlier post about Sentiment, our sentiment scores agree with humans about 75% of the time, right out of the gate.

“Is that good enough?” people ask us. “It depends,” is the answer.

It is extremely powerful to have the system score hundreds of thousands of posts in real time so that you can be alerted to potentially volatile issues and situations without doing any special scoring work yourself. For example, when Netflix logged in earlier this month (or maybe when they opened their daily email alert from Scout Labs), they saw this data:

Picture 68.png

Pretty clear that customers were not happy about something on Monday, March 30th. Drilling in, Netflix could quickly see that the decision to increase the price of its Blu-ray DVD rentals was not a popular one! We correctly scored lots of unhappy posts.

Picture 69.png

We didn’t score every post correctly. I found one post that we scored as negative, based on the following phrase: “Netflix blamed the company for…” We thought it meant that Netflix the company was being blamed, but really it was Netflix doing the blaming. But in aggregate, Netflix gets a very good real-time sense of what customers are thinking, thanks to Scout Labs’ automated sentiment scoring.

And when you care enough to make sentiment the very best, you can always correct or override the sentiment values in the application, which updates the value only for you and your team. If you are an agency (especially PR), this is part of the value that you can provide to your client: getting the sentiment values JUST right. There’s an added benefit to scoring and improving sentiment: the team at Scout Labs uses those overrides as labeled data to improve our sentiment accuracy rate over time.
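For readers wondering how a figure like "agrees with humans about 75% of the time" is computed, here is a minimal sketch. It assumes a set of posts that a person has reviewed and labeled, whether they confirmed or changed the automated value; the function name, the label strings, and the sample data are invented for the example and do not come from Scout Labs.

```python
def agreement_rate(automated, human_labels):
    """Fraction of human-reviewed posts where the automated label matched.

    `automated` maps post id -> automated sentiment label.
    `human_labels` maps post id -> label a person assigned on review.
    Posts without a human label are ignored. Returns None when there
    is nothing to compare.
    """
    reviewed = [post_id for post_id in human_labels if post_id in automated]
    if not reviewed:
        return None
    matches = sum(1 for post_id in reviewed
                  if automated[post_id] == human_labels[post_id])
    return matches / len(reviewed)


if __name__ == "__main__":
    # Invented sample data: four reviewed posts, one disagreement.
    automated = {1: "negative", 2: "negative", 3: "positive", 4: "neutral"}
    human = {1: "negative", 2: "neutral", 3: "positive", 4: "neutral"}
    print(agreement_rate(automated, human))
```

Note that the rate is only meaningful if the reviewed set includes posts where the person agreed with the automated score; a set made up purely of corrections would always show zero agreement.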

So is sentiment scoring at Scout Labs perfect? No. Is it still incredibly powerful and useful in identifying problems? Yes. It acts as an early-warning system and brings very important problems to your attention.

Today we have two new features available in the application: six months of data and agency co-branding of workspaces.

Agency co-branding is the ability to add a second logo to the workspace header, with custom text that appears on the homepage and when the second logo is clicked. Agencies have been asking for the ability to co-brand the application for clients that they share a workspace with, like so:

Picture 32.png

This feature will support the addition of any second logo to a workspace, for instance the iPhone group within Apple, or the Basketball team within Nike. Go to the “Settings” tab and note that there are now two slots for uploaded images plus an agency info/custom text field. Just make sure you hit “refresh” after you upload new images to see your new assets in the header. You can format the text using basic HTML or Markdown syntax, such as * for italics or ** for bold, and you can include links in the text field. More detail on how to format the agency info area is available in our Support section.

Six months of blog data is also now available. We are hearing from all of you that two or more years of data is optimal, to do year-over-year trending and get a little more history, so expect more data to be made available soon. Sentiment is available only for the past three months. Six-month graphs are also available, with one caveat: because we only support sentiment on the last three months of data, the six-month sentiment view may flatline for dates more than three months back. If you get more, consider it the gift of an idle 8-core processor!

Picture 34.png

Next up: Blog comments, one of the two remaining must-have data types (the other is message boards, which won’t be available until summer). Also coming are customizable date ranges for viewing blog data and export of data, both of which have been repeatedly requested by users who want to tailor the view within the application and work with data outside it.

In terms of other major features, we are working hard on full Twitter data, including graphs and frequent word lists. You’ll be able to plug in a search and use more exact search parameters than are supported by search.twitter.com and get trendable graph data, which no one else has. We’ll definitely let you know as soon as that one is ready!