The new study was intended to determine the validity of this method.
This article was machine-translated.
About one in three people in the world uses the Internet, so it is not surprising that data on its use can provide important statistical information. Given that about 5% of all search engine queries relate to medicine and health care, these data can be used to analyze health status.
This idea is not new. Google itself (now part of Alphabet) used search data to predict influenza outbreaks. The «Google Flu Trends» service ran from 2005 to 2015 and is now available only as a data archive (see.
One of the new areas of «Big Data» use is psychiatry: predicting suicide rates in order to assess the effectiveness of prevention interventions at the population level. The first work on the subject was published in 2010, and the method and its validity are now being studied intensively.
A new study on suicide prediction was published on Aug. 16 in «PLoS ONE».
The study had a relatively simple design. First, real suicide statistics from US, German and Austrian databases for 2004-2010 were analyzed. These were then compared with statistics on the use of suicide-related search terms on Google during the same period. Queries such as "suicide," "depression," "how to kill myself," "suicide online," etc. were analyzed in multiple languages. Analysis of data from the use of such searches performed using
Cross-correlation analysis was used to assess the relationship between the search query data and the real picture. The share of statistically significant cross-correlation coefficients (i.e., cases where predicted and real data matched) by country was as follows: United States - 9.96%, Germany - 2.29%, Austria - 11.43%, Switzerland - 2.86%.
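For readers unfamiliar with the technique, cross-correlation compares two time series at several time shifts (lags) to see whether one tends to move ahead of the other. The sketch below is a simplified illustration and not the code used in the study: the function names, the sample numbers, and the approximate significance bound of 1.96/sqrt(n) are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cross_correlation(searches, suicides, max_lag):
    """Return {lag: r}. A positive lag means searches precede suicides by `lag` periods.

    Simplified variant: each lag correlates only the overlapping parts of the series.
    """
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x = searches[:len(searches) - lag]
            y = suicides[lag:]
        else:
            x = searches[-lag:]
            y = suicides[:len(suicides) + lag]
        result[lag] = pearson(x, y)
    return result


def is_significant(r, n):
    """Rough 5% check (assumption): |r| exceeds 1.96 / sqrt(n)."""
    return abs(r) > 1.96 / sqrt(n)


# Made-up numbers: the second series repeats the first with a delay of two periods.
searches = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12, 10, 11]
suicides = [0, 0] + searches[:-2]
coefficients = cross_correlation(searches, suicides, max_lag=3)
best_lag = max(coefficients, key=coefficients.get)
```

In this toy example the strongest coefficient appears at a lag of two periods, because the second series was built as a delayed copy of the first. In a study like the one described, each combination of search term and lag yields one such coefficient, and the reported percentages are the share of those coefficients that pass the significance check.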
This means that the capacity of Google Trends to predict suicide rates was rather small. On average, 8.34% of cross-correlation coefficients were significant. This is only slightly above 5%, the Type I error level. In other words, the prediction was only slightly more accurate than chance.
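The comparison with 5% can be illustrated with a small simulation: when two series are completely unrelated, roughly one coefficient in twenty still looks "significant" at the 5% level. This is a sketch under stated assumptions, not a reproduction of the study: the series length of 84 points (seven years of monthly values), the Gaussian noise, and the 1.96/sqrt(n) bound are all chosen here for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def false_positive_share(n_points=84, n_trials=2000, seed=0):
    """Share of unrelated random series pairs whose correlation passes the 5% bound."""
    rng = random.Random(seed)
    threshold = 1.96 / sqrt(n_points)  # approximate 5% bound (assumption)
    hits = 0
    for _ in range(n_trials):
        x = [rng.gauss(0, 1) for _ in range(n_points)]
        y = [rng.gauss(0, 1) for _ in range(n_points)]
        if abs(pearson(x, y)) > threshold:
            hits += 1
    return hits / n_trials


share = false_positive_share()
print(f"Share of 'significant' results for pure noise: {share:.1%}")
```

A share close to 5% for pure noise is the baseline against which the 8.34% figure should be read.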
To ensure objectivity in
* The closer the index is to 1, the more precise the prediction.
Research | Country | Method | Effect size, r * |
---|---|---|---|
Ma-Kellams et al., 2016 | US | correlation and linear regression analysis | large (0.49-0.63) |
Gunn and Lester, 2014 | US | correlation analysis | medium to large (0.31 and 0.61) |
McCarthy, 2010 | US | correlation analysis | large (0.70 and 0.50) |
Sueki, 2011 | Japan | cross-correlation analysis | medium to large (0.25-0.43) |
Yang et al., 2011 | Taiwan | cross-correlation and linear regression analysis | medium to large (0.27-0.48) |
The author declares that no competing interests exist.