MappyHealth mines Twitter data for trends in health terms. Our hypothesis is that social data could help predict disease outbreaks. We track disease terms and associated qualifiers to present these social trends. We have found that every term and condition trend tracked on our site has a band of “social noise”: the everyday ebb and flow of tweets associated with a given term. Spikes in volume and duration above that band signal events related to the term. These events can be positive or negative. MappyHealth seeks to foster awareness of these spikes through various mapping and analytical views.
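To make the idea of a noise band concrete, here is a minimal sketch of one way to flag spikes. It is an illustration, not the algorithm MappyHealth runs: the band definition (mean plus or minus two standard deviations), the function names, and the hourly counts are all assumptions made up for this example.

```python
from statistics import mean, stdev


def noise_band(history, width=2.0):
    """Return (low, high) bounds of the everyday noise band.

    Assumption: the band is the mean of past counts plus or minus
    `width` sample standard deviations. The site may define it differently.
    """
    center = mean(history)
    spread = stdev(history)
    return max(0.0, center - width * spread), center + width * spread


def find_spikes(history, recent, width=2.0):
    """Return (index, count) pairs from `recent` that rise above the band."""
    _, high = noise_band(history, width)
    return [(i, count) for i, count in enumerate(recent) if count > high]


# Made-up hourly tweet counts for a single term.
history = [100, 110, 95, 105, 90, 100]
recent = [102, 98, 2050, 120, 101]
print(find_spikes(history, recent))  # [(2, 2050), (3, 120)]
```

A wider band (a larger `width`) flags fewer, larger spikes; a narrower one is more sensitive but noisier.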
We fetch real-time data from Twitter (via the streaming API) for the 234 terms we track. We analyze each tweet to determine which of the 29 condition sets it matches, and then which of the qualifier words we track it contains. This lets our visitors see disease trends focused on topics such as vaccination, have, don’t have, etc. All of the trending and filtering is accomplished via a series of algorithms and map-reduce jobs.
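The matching step described above can be sketched roughly as follows. This is a hedged illustration only: the condition sets, terms, and whole-word matching rule shown here are hypothetical stand-ins, not the actual 234 terms, 29 condition sets, or matching logic used on the site.

```python
import re

# Hypothetical sample data; the real term and condition lists are not shown here.
CONDITION_TERMS = {
    "influenza": ["flu", "influenza", "h5n1"],
    "meningitis": ["meningitis"],
}
QUALIFIERS = ["vaccination", "have", "don't have"]


def contains_phrase(text, phrase):
    """True if `phrase` appears in `text` as a whole word or phrase."""
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, text) is not None


def classify(tweet_text):
    """Return the condition sets and qualifiers a tweet matches."""
    # Lowercase and normalize curly apostrophes so "don’t" matches "don't".
    text = tweet_text.lower().replace("\u2019", "'")
    conditions = sorted(
        name
        for name, terms in CONDITION_TERMS.items()
        if any(contains_phrase(text, term) for term in terms)
    )
    qualifiers = [q for q in QUALIFIERS if contains_phrase(text, q)]
    return {"conditions": conditions, "qualifiers": qualifiers}


print(classify("Got my flu vaccination today"))
```

Note that with this simple rule a tweet containing “don’t have” also matches the qualifier “have”; a production matcher would need to decide how to treat overlapping qualifiers.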
Once our analysis is complete, we present this information to the visitor in various ways, providing a versatile utility for visualizing and researching disease trends. Our information is global; however, at this time we track only tweeters whose primary language is English. We will add more primary languages in the future.
We have analyzed 6,271,726 tweets. To provide a great user experience, we have built the MappyHealth platform using HTML5, Ruby, and JSON. MappyHealth sits on top of a MongoDB database, the Amazon EC2 cloud, and a Heroku web server, providing high reliability and faster queries than traditional SQL-based platforms.
Try MappyHealth today on your PC, mobile device, or tablet!
Sensor-based location tweets are gathered using the geolocation data provided with the tweet. These geolocations are plotted on our map view and aggregated in several ways.
User profile location tweets are gathered using the location a user has entered in their Twitter profile. This information is not geolocation based, and Twitter neither verifies nor requires it when a user sets up a profile.
We have found that Twitter has millions of user profile locations. Some of these profile locations are erroneous and present false data. We have also found that most events associated with a disease trend produce hundreds, if not thousands, of tweets a day. An event occurring in a specific location would most likely produce similar volumes. Our user profile location statistics are therefore based on locations with more than 1,000 associated tweets in our data store. This helps us filter out low-volume locations while still producing data on a wide range of locations. On our profile location page, a user can search the top 500 locations for what’s trending.
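As a rough sketch of the filtering described above, the snippet below counts tweets per profile location, keeps locations above a volume threshold, and returns the busiest ones. The function name, the normalization (trimming and lowercasing), and the sample data are assumptions for illustration; the site's own pipeline runs as map-reduce jobs and may normalize locations differently.

```python
from collections import Counter


def top_profile_locations(profile_locations, min_tweets=1000, limit=500):
    """Return up to `limit` (location, tweet_count) pairs, busiest first.

    Only locations with more than `min_tweets` tweets are kept.
    Empty or missing profile locations are ignored.
    """
    counts = Counter(
        loc.strip().lower()
        for loc in profile_locations
        if loc and loc.strip()
    )
    busy = [(loc, n) for loc, n in counts.most_common() if n > min_tweets]
    return busy[:limit]


# Made-up sample: one profile location string per tweet.
sample = ["Boston", "boston ", "Boston", "Mars", "", None,
          "Austin", "Austin", "austin", "AUSTIN"]
print(top_profile_locations(sample, min_tweets=2, limit=5))
# [('austin', 4), ('boston', 3)]
```

The small threshold in the usage example is only there to keep the sample short; the defaults mirror the 1,000-tweet cutoff and top-500 list mentioned above.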
We have found numerous examples of trend monitoring surfacing events associated with the terms we track. These events are not limited to outbreaks; they also include report releases, infections, awareness campaigns, and much more. Below are two examples of a trend spike tied to an event. Had these events been outbreaks, we would have captured them within the hour.
H5N1 and two event spikes
We have seen three spikes for H5N1. The first was a 1950% spike on May 2nd, associated with the release of a study finding that H5N1 can evolve the ability to spread among mammals. The second spike was also associated with this event. The third spike is related to a confirmed infection in a crested myna. Tweets in this spike came predominantly from Malaysia.
Meningitis and a Lab Worker and 3 year old in South America
A spike was noted between May 2nd and May 3rd. It was attributed to two simultaneous events: first, a lab researcher in California who died after contracting meningitis; second, multiple tweets about a 3-year-old girl who contracted meningitis in South America.
We have been collecting data steadily since April 30th. Currently we have 6,271,726 tweets in our database, which we use to present the various trends on our site. To see how much data we add each hour to support your disease trending, check out our all conditions trend graph.
We absolutely want your feedback. In fact, our visitors will help us enhance this application. To provide feedback on our site, or for any questions, please email info@MappyHealth.com. Thank you for visiting MappyHealth.com.
For inquiries about partnering with or investing in MappyHealth.com, please email info@MappyHealth.com.