Last Wednesday Jon Gosier and Matthew Griffiths from the SwiftRiver team were at the iHub in Nairobi to present SwiftRiver 101. I could not have been more impressed with the work of the Swift team!
What exactly is SwiftRiver?
According to the website:
“SwiftRiver is a free and open source software platform that uses algorithms and crowdsourcing to validate and filter news.”
SwiftRiver is an initiative of Ushahidi Inc. The project is a response to the challenge of handling large amounts of small pieces of data with limited resources, particularly in crisis situations.
SwiftRiver automates some of the work of an administrator of an Ushahidi website (for example Haiti, Hatari or Voice of Kibera). The application automatically parses out the “who”, “what” and “where” of a short piece of text. The text could come from Twitter, a web form, email, SMS, news headline, etc. The platform is made up of the following components:
- SiLCC – Natural Language Processing for SMS and Twitter
- SULSa – Location Services
- SiCDS – Filters for duplicate information (for example exact re-tweets on Twitter)
- River ID – Establishes the distributed reputation of an individual source (i.e. how reliable the information is that I generate as an individual from my phone, my Twitter account, my blog, and any other channel through which I submit information)
- Reverberations – Measures influence of online content
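To make the duplicate-filtering idea concrete, here is a minimal sketch of how exact re-tweets might be dropped from a stream of short messages. It is not the actual SiCDS code: the function names `normalize` and `filter_duplicates` and the normalization rules (strip a leading "RT @user:", collapse whitespace, lowercase) are assumptions made for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Drop a leading 'RT @user:' prefix, collapse whitespace, and lowercase.

    These rules are an illustrative assumption, not SiCDS behaviour.
    """
    text = re.sub(r"^\s*rt\s+@\w+:?\s*", "", text.strip(), flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", text).lower()


def filter_duplicates(messages):
    """Return messages in their original order, keeping the first copy of each."""
    seen = set()
    unique = []
    for message in messages:
        # Hash the normalized text so only a short fingerprint is stored.
        digest = hashlib.sha1(normalize(message).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(message)
    return unique


messages = [
    "Road blocked near Kibera",
    "RT @amina: Road blocked near Kibera",
    "road  blocked near kibera ",
    "Water shortage in Mathare",
]
unique_messages = filter_duplicates(messages)
```

A filter like this only catches exact or near-exact copies. Reworded reports of the same event would still need a human reader or a fuzzier comparison.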
Not only does the Swift platform parse small pieces of text, but it also stores information about the reliability of different sources of information (see River ID above). This can help Ushahidi administrators decide whether or not to verify a piece of information coming from an individual source, based on that source's past history and reliability.
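One way to picture a stored reliability history is a running score per source. The sketch below is an assumption for illustration only and does not reflect how River ID actually computes reputation: the class name `SourceReputation`, its methods, and the smoothing formula are all hypothetical.

```python
class SourceReputation:
    """Tracks how often each source's past reports turned out to be accurate.

    Hypothetical illustration; not the River ID algorithm.
    """

    def __init__(self):
        self._accurate = {}
        self._total = {}

    def record(self, source_id: str, was_accurate: bool) -> None:
        """Store the outcome of one reviewed report from this source."""
        self._total[source_id] = self._total.get(source_id, 0) + 1
        if was_accurate:
            self._accurate[source_id] = self._accurate.get(source_id, 0) + 1

    def score(self, source_id: str) -> float:
        """Smoothed share of accurate reports; an unknown source scores 0.5."""
        accurate = self._accurate.get(source_id, 0)
        total = self._total.get(source_id, 0)
        return (accurate + 1) / (total + 2)


reputation = SourceReputation()
for _ in range(3):
    reputation.record("twitter:amina", was_accurate=True)
reputation.record("sms:unknown-sender", was_accurate=False)
```

The smoothing keeps a source with one or two reports from jumping straight to a perfect or zero score. An administrator would still decide what score, if any, is enough to skip manual verification.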
Crowdsourcing and Data Verification
Two questions that are often asked of Ushahidi deployments are “How do you verify your data?” and “How do you know the information is accurate?”
Both questions come down to one: is crowdsourced information reliable? Crowdsourcing relies on information submitted by a dispersed network over time. You may not be able to decide how reliable a single text message or tweet is, but the strength of crowdsourcing lies in the collective wisdom of a group of people. The best-known example of crowdsourcing is the online, user-generated encyclopedia Wikipedia. Although the accuracy of Wikipedia is constantly being challenged, this constant critique leads to improved content over time. In 2005, Nature magazine published a special report comparing a random sample of 42 scientific entries in Encyclopedia Britannica and Wikipedia. The author finds that there are
“numerous errors in both encyclopaedias, but among 42 entries tested, the difference in accuracy was not particularly great: the average science entry in Wikipedia contained around four inaccuracies; Britannica, about three.”
The author argues that the major advantage of Wikipedia is the ability to update and change entries quickly. This can be likened to the near-real-time collection and publication of information through the Ushahidi platform. That speed comes with a responsibility, particularly for controversial or sensitive issues: you need a team that is knowledgeable about the issue to read, possibly edit, approve and/or verify reports. SwiftRiver alone cannot do this job for you.
How do organizations deal with crowdsourced information?
It is the responsibility of each organization to develop standards, procedures, or a policy based on their knowledge of the issue(s) they are monitoring (it’s up to you!). The Ushahidi platform allows you to APPROVE and/or VERIFY reports, which then show on the map as VERIFIED “YES” or VERIFIED “NO”.
A snapshot of the administrative side of the platform is below. Note that you can approve a report without verifying it, and you can indicate its reliability (reliability is not made public).
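The relationship between approving, verifying and rating reliability can be sketched as a small data structure. This is not Ushahidi's actual data model: the `Report` class, its field names, and the rule that only approved reports are shown publicly are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Report:
    """Hypothetical report record; field names are illustrative assumptions."""

    text: str
    approved: bool = False  # assumed: controls whether the report is shown
    verified: bool = False  # shown on the map as VERIFIED "YES" or "NO"
    reliability: Optional[int] = None  # administrator's rating, never public

    def public_view(self) -> Optional[dict]:
        """Return what a map visitor would see, or None if not approved."""
        if not self.approved:
            return None
        return {"text": self.text, "verified": "YES" if self.verified else "NO"}


# Approved but not verified, with a private reliability rating.
report = Report("Road blocked near Kibera", approved=True, reliability=3)
visible = report.public_view()
```

Keeping approval and verification as separate flags mirrors the point above: a report can be published before anyone has confirmed it, and the reliability rating stays out of the public view.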
SwiftRiver as a standalone tool
SwiftRiver does not necessarily need to be plugged into the Ushahidi platform. The application itself can be used to track and store data from many different sources and store that information over time.
A number of use cases were discussed in one of the breakout sessions of SwiftRiver 101:
- Brand monitoring – a company or organization could set up the SwiftRiver platform to pull in keywords from Twitter and specific websites to monitor what people are saying about their product(s) or service(s)
- Disaster risk reduction – monitoring opinions and sentiments about certain issues in a specific geographic area over time. Indications of unrest may be apparent in the discourse, prompting intervention by responsible agencies.
Interested? Have an application for SwiftRiver?
SwiftRiver is currently available in a pre-beta version – Batuque v.0.20.
Download it from http://www.swift.ushahidi.com or view it here.
Learn more about SwiftRiver through the SwiftRiver 101 slide show and the website.