Ophir, Y., Tikochinski, R., Asterhan, C. S., Sisso, I., & Reichart, R. (2020). Deep neural networks detect suicide risk from textual Facebook posts. Scientific Reports, 10(1), 1-10. https://doi.org/10.1038/s41598-020-73917-0
Introduction
Detection of suicide risk is a highly prioritized yet complicated task. Five decades of research have produced slightly better than chance.
In this study, Artificial Neural Network (ANN) models were constructed to predict suicide risk from the everyday language of social media users.
Methodology
The dataset was constituted by postings on Facebook alongside valid psychosocial information about the users.
Using Deep Contextualized Word Embeddings for text representation, two models were constructed: A Single Task Model (STM) to predict suicide risk from Facebook postings directly, and a Multi-Task Model (MTM) which included hierarchical and multilayered sets of theory-driven risk factors.
- STM
Facebook texts -> suicide
- MTM
Facebook texts -> personality traits -> psychosocial risks -> psychiatric disorders -> suicide
Result
Compared with STM, the MTM produced significantly improved prediction accuracy with substantially larger effect sizes. Subsequent content analyses suggested that predictions did not rely on explicit suicide-related themes but on a range of text features.
Conclusion
The finding suggests that machine learning-based analyses of everyday social media activity can improve suicide risk predictions and contribute to the development of practical detection tools.
What is the CWE embedding?
The study extracted representations of Facebook texts using a deep Contextualized Word Embeddings (CWE) algorithm. These text representations have served as the input for ANN models.
The ELMo model (Embeddings from Language Models) the study used is a type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). The word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus.
Learn more about ANN

An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain.
Various combinations and options for the MTM model
The study has constructed a model that has three theory-driven layers of contributing factors.
There could be various combinations and hierarchy options for layers of the model. What else?
Embedding Method
The study has used a pre-trained ELMo model (which is available at https://tfhub.dev/google/elmo/2) and extracted a 1024-dimensional embedding vector for each post in the data through mean-pooling over the contextualized word embeddings generated for the post. The overall textual activity of the user was represented as the average of its post vectors and served as an input to the ANN models.
Wouldn't there be a better way than just mean-pooling? Also, would there be a better embedding method available?
Incorporating theory-driven measures to machine learning
One of the major implications of the study is that it could incorporate theory-driven measures into the model and could strengthen the construct and external validity of the findings. Rather than using proxy diagnostic measures, the study relied on external, clinically valid measures of suicide risk. The inclusion of additional psycho-diagnostic tools (the personality, psychosocial, and psychiatric measures) also contributed to the validity of the study.
This idea could be even more improved by learning more about psychologically proven cues of MDD or suicidality and trying to apply them to the model.
Incorporating non-textual features into the prediction
One of the limitations of the study is that it focuses only on the language-based input to ANN models. Recent studies on depression detection indicated the superiority of textual content over other types of social network signals, such as length or timestamps of postings. Incorporation of non-textual social media activity features (e.g. pictures) could further improve the quality of suicide risk predictions.
Eichstaedt, J. C. et al. Facebook language predicts depression in medical records. Proc. Natl. Acad. Sci. 115, 11203–11208 (2018).
The idea of MTM, which uses hierarchical and multilayered sets of theory-driven risk factors is so interesting.
Also, using contextual embedding which has not been used in existing research is interesting.
The data of the study are available on request from the corresponding author [YO].
AUC scores can be converted to Cohen's d
AUC scores can be easily transformed into Cohen's d, the most common effect-size measure in experimental psychology.
Salgado, J. F. Transforming the area under the normal curve (AUC) into Cohen’sd, Pearson’s rpb, odds-ratio, and natural log odds-ratio: Two conversion tables. Eur. J. Psychol. Appl. Legal Context 10, 35–47 (2018).
Limitation of suicide prediction study
Judgments of suicidal thoughts or behaviors made based on textual content may not properly measure actual suicide (construct validity) due to self-representation biases and language ambiguities on social media.
Advantages of using ELMo for Embeddings
The study has used ELMo, which is a state-of-the-art method for embedding. ELMo extracted the representation of Facebook texts into 1024-dimensional vectors.
The use of ELMo has two major advantages over other text representation techniques, such as word count or N-grams. First, It is character-based (rather than word-based) and therefore it generates representations of non-words that are popular in social media language. Second, it generates representations of words within their context (i.e., a given word receives different representations depending on its surrounding text).
Word analysis after using ELMo Embeddings
Even though the character-based and high dimensional embedding method has been used, the study still could do the word analysis afterward.
The researchers applied TF-IDF analysis to detect specific words that distinguish between True Postitive and True Negative users. They extracted the 100 most characteristic words that best distinguished the given class from the rest.
Advantages of employing ANN models
Since many users refrain from sharing explicit depressive content, algorithms that rely solely on explicit distress-related content or established lexicons are arguably more susceptible to False Negative results.
In contrast, the current ANN models were capable of detecting subtler cues of mental health difficulties, even when users refrain from sharing explicit suicide-related content.
The advancement of deep neural networks in detecting suicide risk from textual Facebook posts is a promising development in mental health awareness and intervention. Leveraging AI to analyze language patterns can potentially provide timely support to those in need. Initiatives like this exemplify the intersection of technology and social responsibility. To explore innovative AI solutions, visit https://sanhe-machinery.com/.