Data
Online press articles
Online news articles were collected to explore media reports and frames reflecting the social perception of schizophrenia patients. We targeted the text of news articles published by around 800 media companies, all made available through the large Korean online search engine “Naver.com”, during the study period from January 1, 2005 to December 31, 2018. Naver.com is the Korean media brand with the highest nationwide usage rate, at over 65%, and its users span progressive, moderate, and conservative political leanings (17). Naver.com’s nationwide usage and broad political spectrum therefore support the validity and reliability of its data and the generalizability of the study findings. The search terms were “Jungshinbunyeolbyung” (mind-splitting disorder) and “Johyeonbyung” (attunement disorder), and matching articles were collected automatically using Python. To study differences in social perception between the disease names before and after the revision, the dataset was divided into news articles discussing “Jungshinbunyeolbyung” (mind-splitting disorder) before and after the name change (from January 1, 2005 to December 31, 2010 and from January 1, 2012 to December 31, 2018) and news articles discussing “Johyeonbyung” (attunement disorder) after the name change (from January 1, 2012 to December 31, 2018). Articles duplicated across different media outlets were excluded from the analysis. As a result, the total number of articles used in the analysis was 2,743 (mind-splitting disorder before the name change), 3,114 (mind-splitting disorder after the name change), and 3,068 (attunement disorder).
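The partitioning and duplicate removal described above can be sketched as follows. This is only an illustration under stated assumptions: the record fields (`term`, `published`, `body`), the dataset names, and the rule that two articles are duplicates when their whitespace-normalized, lowercased bodies match are all hypothetical, since the text does not specify how overlapping articles were identified.

```python
from datetime import date

# Study windows taken from the text; the dictionary keys are hypothetical labels.
PERIODS = {
    "before": (date(2005, 1, 1), date(2010, 12, 31)),
    "after": (date(2012, 1, 1), date(2018, 12, 31)),
}


def assign_dataset(term, published):
    """Return a dataset name for an article, or None if it falls outside all three."""
    if term == "Jungshinbunyeolbyung":
        for name, (start, end) in PERIODS.items():
            if start <= published <= end:
                return f"mind_splitting_{name}"
    elif term == "Johyeonbyung":
        start, end = PERIODS["after"]
        if start <= published <= end:
            return "attunement_after"
    return None


def build_datasets(articles):
    """Group articles by dataset and drop duplicates within each dataset.

    Assumption: a duplicate is an article whose normalized body text
    has already been seen in the same dataset.
    """
    datasets = {}
    seen = set()
    for article in articles:
        name = assign_dataset(article["term"], article["published"])
        if name is None:
            continue
        key = (name, " ".join(article["body"].split()).lower())
        if key in seen:
            continue
        seen.add(key)
        datasets.setdefault(name, []).append(article)
    return datasets


# Hypothetical records, for illustration only.
articles = [
    {"term": "Jungshinbunyeolbyung", "published": date(2008, 5, 1), "body": "Report A"},
    {"term": "Jungshinbunyeolbyung", "published": date(2008, 5, 2), "body": "report  a"},
    {"term": "Jungshinbunyeolbyung", "published": date(2011, 3, 1), "body": "Report B"},
    {"term": "Johyeonbyung", "published": date(2015, 7, 9), "body": "Report C"},
]
datasets = build_datasets(articles)
for name, items in datasets.items():
    print(name, len(items))
```

In this toy input the second record is dropped as a duplicate of the first, and the 2011 record is dropped because it falls between the two study windows.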
Number of patients admitted to psychiatric hospitals with a diagnosis code of schizophrenia
Data for each month from January 2010 to July 2018 on the number of people admitted to psychiatric wards with a diagnosis of schizophrenia (ICD-10 code F20) were collected from the Korean Health Care Big Data System (http://opendata.hira.or.kr).
Media coverage of general crimes committed by schizophrenia patients
For media coverage, we considered only the three terrestrial television networks in Korea (KBS, MBC, and SBS), mainly because these networks exert greater influence on audiences than other media types, as suggested by previous studies. The number of news articles published monthly by each TV channel on general crimes committed by schizophrenia patients between January 2010 and July 2018 was calculated using the most popular news aggregator in Korea, Naver.com.
Analysis
Latent Dirichlet Allocation Topic Modeling
We analyzed the social perception of mind-splitting disorder and attunement disorder before and after the renaming of schizophrenia by collecting online news articles related to schizophrenia and applying several text mining techniques. First, the overall characteristics of the online news articles were examined using latent Dirichlet allocation (LDA) topic modeling for macroscopic language analysis. LDA is the most widely used topic modeling technique; it estimates the probability distribution over terms for each of the latent topic groups extracted from a collection of articles (18,19,20). In this study, LDA topic modeling was carried out to investigate the differences between media topics related to the disease names before and after the name change.
LDA topic modeling was performed on a dataset divided into news articles by time period (before/after the disease name revision) and disease name (“Jungshinbunyeolbyung” for mind-splitting disorder and “Johyeonbyung” for attunement disorder). We tried using perplexity values to determine the number of topics, but the values decreased monotonically across all datasets. Thus, after carrying out topic modeling with the number of topics set to 10, 20, 30, and 40, we set the number of topics to 30, which was considered appropriate for interpretation because of the high similarity between the main keywords of the same topic and the low similarity between topics. The authors extracted 20 keywords per topic and annotated each topic based on the association between the keywords within it. To evaluate the classifications, two independent psychologists were invited to participate, and they qualitatively confirmed the reliability of the suggested terms representing the different media frames (21). Inter-researcher agreement scores for loose and tight matches were calculated by dividing the number of topics consistently agreed upon by the total number of topics. To further validate the inter-researcher agreement scores, we also adopted Krippendorff’s alpha (22) to correct for any potential bias resulting from redundancy in media frames and the number of participating investigators. For the analysis, the Gensim library (23) for Python was used, and the results of LDA topic modeling were visualized with pyLDAvis.
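The two agreement measures mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each of two raters assigns one nominal frame label to every topic, uses the standard coincidence-matrix form of Krippendorff's alpha for nominal data, and runs on made-up labels. The text does not define the loose and tight matching rules, so they are not modeled here.

```python
from collections import Counter
from itertools import permutations


def percent_agreement(labels_a, labels_b):
    """Share of topics on which both raters assigned the same label."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: list of lists, one inner list per topic holding the labels
    given by the raters who coded that topic.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a unit with a single rating cannot be paired
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)
    totals = Counter()
    for (a, _), value in coincidences.items():
        totals[a] += value
    n = sum(totals.values())
    observed = sum(v for (a, b), v in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    return 1 - observed / expected


# Hypothetical frame labels from two raters for four topics.
rater_a = ["crime", "crime", "treatment", "treatment"]
rater_b = ["crime", "crime", "treatment", "crime"]

agreement = percent_agreement(rater_a, rater_b)
alpha = krippendorff_alpha_nominal([list(pair) for pair in zip(rater_a, rater_b)])
print(agreement, alpha)
```

Alpha is lower than raw agreement here because it discounts the agreement expected by chance given how often each label is used. The function is undefined when every rating uses the same single label, since expected disagreement is then zero.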
Term frequency-inverse document frequency weighting analysis
For microscopic language analysis, the relationships and contextual features between articles were examined using the Term Frequency-Inverse Document Frequency (TF-IDF) weighting model (24). TF-IDF is a text mining measure of how important a word is to an article within a collection. The larger the TF-IDF value, the more likely the word is to determine the topic of the article it belongs to, which makes it a useful measure for extracting keywords (25). In the TF-IDF analysis, the TF-IDF values of the top five words per article were calculated for each dataset. Then, words of varying meanings were ranked in descending order by TF-IDF value, and the first 20 words and the last 3 words were compared.
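As a rough illustration of the weighting described above, the sketch below computes TF-IDF per article and picks the highest-weighted words. It assumes one common variant (relative term frequency multiplied by the natural log of the inverse document frequency); the text does not state which variant or tokenizer was used, and the token lists are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(term_weights, k=5):
    """Return the k terms with the largest TF-IDF weight in one document."""
    return sorted(term_weights, key=term_weights.get, reverse=True)[:k]


# Hypothetical tokenized articles.
docs = [
    ["patient", "crime", "police"],
    ["patient", "treatment", "hospital"],
    ["patient", "treatment", "stigma"],
]
weights = tf_idf(docs)
print(top_terms(weights[0], k=2))
```

A word that appears in every article, such as "patient" in this toy corpus, receives a weight of zero, which is why TF-IDF favors words that distinguish one article from the rest.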
Quantitative epidemiological analysis
We investigated the effect of media coverage of crimes committed by schizophrenia patients on the nationwide medical utilization patterns of patients with the illness by analyzing epidemiological data with a linear regression model. To examine the relationship between the number of news articles and the change in the number of patients admitted to psychiatric wards, we used the following regression model:
$$Daily\_patients\_change\_rate_t=\beta_0+\beta_1\,\#news\_articles_t+\beta_2\,\#news\_articles_{t-1}+\mu\,month\_effects+\eta\,year\_effects +e_{t},$$
where $Daily\_patients\_change\_rate_t$ reflects the rate of change in the average number of people admitted to psychiatric wards with a diagnosis of schizophrenia per day during month $t$ compared with the previous month. To compute it, the average number of daily patients was first calculated by dividing the number of monthly patients by the number of days in the month. Then $Daily\_patients\_change\_rate_t$ was calculated as follows:
$$\left(\#\,of\,daily\_patients_{t}-\#\,of\,daily\_patients_{t-1}\right)/\#\,of\,daily\_patients_{t-1}\ast100$$
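The calculation of the dependent variable can be sketched as follows. The function names and the monthly counts are hypothetical and serve only to show the arithmetic: monthly admissions are divided by the number of days in the month, and the percentage change relative to the previous month is then taken.

```python
import calendar


def daily_average(monthly_patients, year, month):
    """Average number of patients per day in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_patients / days_in_month


def change_rates(monthly_counts):
    """Percentage change in the daily average versus the previous month.

    monthly_counts: list of ((year, month), patients) in chronological order.
    The first month has no predecessor and therefore no rate.
    """
    rates = {}
    previous = None
    for (year, month), patients in monthly_counts:
        current = daily_average(patients, year, month)
        if previous is not None:
            rates[(year, month)] = (current - previous) / previous * 100
        previous = current
    return rates


# Hypothetical monthly admission counts, not data from the study.
counts = [((2010, 1), 3100), ((2010, 2), 2940), ((2010, 3), 3100)]
rates = change_rates(counts)
print(rates)
```

Dividing by the number of days first keeps short months such as February from appearing as a drop in admissions: in this example the February total is lower than the January total, yet the daily average rises from 100 to 105.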
$\#news\_articles_t$ refers to the number of news articles published by the three TV channels about general crimes committed by schizophrenia patients during month $t$. We also included a variable representing this media coverage in the previous month (i.e., month $t-1$) to account for possible delayed effects of media coverage. Regarding $month\_effects$, since the number of people admitted to psychiatric wards is likely to vary across months, monthly effects were controlled for by incorporating month dummies into the regression model. Regarding $year\_effects$, since the value of the dependent variable may increase over time, year effects were controlled for by including year dummy variables. To strengthen confidence in the results of the regression model, we controlled for month and year effects simultaneously using both sets of dummy variables.
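The structure of this regression can be sketched with ordinary least squares on a design matrix holding an intercept, the current and lagged article counts, and month and year dummies. Everything below is an assumption made for illustration: the text does not name the estimation software, the data are synthetic and noise-free, January and the first year are arbitrarily chosen as reference categories, and the small solver stands in for a statistics package.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(news_t, news_prev, month, year, years):
    """Intercept, current and lagged article counts, month and year dummies."""
    month_dummies = [1.0 if month == m else 0.0 for m in range(2, 13)]
    year_dummies = [1.0 if year == yr else 0.0 for yr in years[1:]]
    return [1.0, float(news_t), float(news_prev)] + month_dummies + year_dummies


# Synthetic data with known coefficients (0.5 and -0.2), for illustration only.
rng = random.Random(0)
years = [2010, 2011, 2012, 2013]
calendar_months = [(yr, mo) for yr in years for mo in range(1, 13)]
news = [rng.randint(0, 20) for _ in calendar_months]

rows, y = [], []
for i in range(1, len(calendar_months)):  # the first month has no lagged value
    year, month = calendar_months[i]
    rows.append(design_row(news[i], news[i - 1], month, year, years))
    y.append(1.0 + 0.5 * news[i] - 0.2 * news[i - 1]
             + 0.1 * month + 0.3 * (year - 2010))

beta = ols(rows, y)
print(round(beta[1], 6), round(beta[2], 6))
```

One dummy is omitted from each set so that the dummies are not collinear with the intercept. The second and third coefficients correspond to $\beta_1$ and $\beta_2$ in the model, and on this synthetic input they recover the values used to generate the data.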