A better way to search Connect
I recently spotted a comment from Microsoft on a Connect item with 13 total up-votes. The comment went something like, "wow, due to the explosive response to this issue, we're going to deal with it right away." Okay, it wasn't that emphatic, it was actually: "I've brought the MVP customer vote count to the attention of dev, and a new owner of this DMV says he will dig up some info for us."
Still, knowing that I had seen other items with a much stronger response and barely a note of acknowledgment (never mind a pledge to actually act in any way), I performed a search. I started with a search for "SSMS" in my own feedback and was overloaded. So I used the advanced search to whittle it down. I tried over 100 votes and didn't get any. I started getting results (well, 1) when I scaled back to 70:
Then I scaled back to 50, and nothing significantly changed. When I scaled back to 30, though, I saw a really, well, interesting result:
Why is this interesting? Well, the item with 78 up-votes doesn't show up in a search for >= 70. It also doesn't show up in a search for >= 50. But when you lower the threshold to 30, it suddenly appears. Another interesting thing is that no matter what number I put into the vote count filter, items with more than 100 votes (such as #311079) never show up:
This is just the most recent bit of fascinating search logic I've discovered on Connect; there are many others over the past few years that have really made me shake my head. I've asked multiple times if we could get an API into the data to write our own searches, or be able to store backups of the data so that we can load them on our own systems and run our own searches. The response was always, predictably, "no, not cool." And they wonder why activity on Connect has dwindled (well, there are several reasons for that, which I may highlight in future blog posts).
Fast forward to last week, when Aaron Nelson (blog | twitter) and Nic Cain (blog | twitter) let me in on a little secret – they were loading Connect's SQL Server data into their own SQL Server database, via the RSS feed. Nic is starting a blog series about it, but the important points are:
- You can now get more direct access to the data. Rob Farley (blog | twitter), for example, has loaded up the Denali bugs into a pivot collection, allowing you to do lots of real-time visualization against a wide variety of filters. You can also click through to items that interest you.
- Aaron has posted a PowerShell script (you weren't expecting anything other than PowerShell, were you?) that will give you read-only access to the data, empowering you to write your own queries against it. I've already used it to run the following, where I was much happier with the results:
SELECT * FROM dbo.ConnectItems WHERE Author = 'aaronbertrand' AND UpVoteCount >= 30;
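
As a rough illustration of what a vote threshold filter ought to do, here is a minimal sketch that runs the same kind of query against an in-memory SQLite table from Python (not the SQL Server database the PowerShell script populates). Only the `Author` and `UpVoteCount` column names come from the query above; the `ItemID` column, the helper function names, and every sample row are made up for the example.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory table with a few made-up rows (not real Connect data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ConnectItems ("
        "ItemID INTEGER PRIMARY KEY, Author TEXT, UpVoteCount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ConnectItems VALUES (?, ?, ?)",
        [
            (1, "aaronbertrand", 120),  # hypothetical item above 100 votes
            (2, "aaronbertrand", 78),   # hypothetical item with 78 votes
            (3, "aaronbertrand", 13),
            (4, "someone_else", 95),
        ],
    )
    return conn


def items_with_min_votes(conn, author, min_votes):
    """Return (ItemID, UpVoteCount) for one author at or above min_votes."""
    cur = conn.execute(
        "SELECT ItemID, UpVoteCount FROM ConnectItems "
        "WHERE Author = ? AND UpVoteCount >= ? "
        "ORDER BY UpVoteCount DESC",
        (author, min_votes),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    for threshold in (100, 70, 50, 30):
        print(threshold, items_with_min_votes(conn, "aaronbertrand", threshold))
```

With a plain `>=` comparison, the 78-vote row is returned for thresholds of 70, 50, and 30 alike, and the 120-vote row is returned for all of them, which is the consistency the site's advanced search lacked.
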
Note, though, that the data is not currently complete; the loading is still in progress. So I'm not 100% happy with the results, but I know they'll get there.
If you want to stay on top of Connect items for SQL Server, I strongly recommend adding the new items and recently modified items feeds to your favorite RSS reader. In combination with easier searching, I find this quite useful.
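
For anyone curious what loading a feed into a local table might involve, here is a minimal Python sketch. It is not Aaron's PowerShell script or Nic's process, and it assumes the feed is plain RSS 2.0 with `title`, `link`, and `pubDate` on each item. The sample XML, the `FeedItems` table, and the function names are all hypothetical, and no real Connect feed is contacted.

```python
import sqlite3
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Hypothetical feedback feed</title>
    <item>
      <title>Example item one</title>
      <link>https://example.com/feedback/1</link>
      <pubDate>Mon, 06 Dec 2010 14:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Example item two</title>
      <link>https://example.com/feedback/2</link>
      <pubDate>Tue, 07 Dec 2010 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return a list of (link, title, published_iso) tuples from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        link = item.findtext("link")
        title = item.findtext("title")
        # pubDate uses the RFC 822 date format; store it as ISO 8601 text.
        published = parsedate_to_datetime(item.findtext("pubDate")).isoformat()
        rows.append((link, title, published))
    return rows


def load_items(conn, rows):
    """Insert or refresh rows, keyed on the item link, so reloads do not duplicate."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS FeedItems ("
        "Link TEXT PRIMARY KEY, Title TEXT, Published TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO FeedItems (Link, Title, Published) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_items(conn, parse_items(SAMPLE_FEED))
    load_items(conn, parse_items(SAMPLE_FEED))  # a second run refreshes in place
    print(conn.execute("SELECT COUNT(*) FROM FeedItems").fetchone()[0])
```

Keying on the link means the load can be re-run on a schedule without piling up duplicate rows, which matters for a data set that has to be refreshed continually.
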
Yup, you're right about the loading – it also needs to be refreshed continually. The key will be when Microsoft realise that they need to publish web services letting Connect be searched. That way, the data could be kept far more up-to-date.