This post is for the week 3 assignment of the Coursera course Data Management and Visualization by Wesleyan University.
Instructions for this assignment:
STEP 1: Create graphs of your variables one at a time (univariate graphs).
STEP 2: Create a graph showing the association between your explanatory and response variables (bivariate graph).
The explanatory (independent) variables:
- INTERVALGROUP (categorical variable, 3 categories)
- NUMSH (categorical variable, 2 categories)
The response (dependent) variable:
- S4AQ4A16 (categorical variable, 2 categories)
What do these variables mean?
INTERVALGROUP = “Three Groups with Different Time Interval Between First Episode & First Sought Help”
NUMSH = “Sought Help Once, Twice or More”
More info: Research Project (3)
Building on the code from the previous week, I added the following code to display the graphs:
The univariate graphs of three variables:
The first graph is unimodal, with its peak at the lowest category (a time interval of 0), and it appears to be skewed to the right.
The second graph is unimodal, with a higher percentage of people who sought help only once (57%) than people who sought help twice or more (43%).
The third graph is unimodal, with a higher percentage of people who didn’t attempt suicide (88%) than people who did (12%).
*Note: Responses were from those who had major depression at the time of the interview.
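The percentages quoted for each univariate graph are simply the share of non-missing observations in each category. Here is a minimal sketch of that calculation in plain Python. The function name percent_distribution and the toy list are made up for illustration; they are not the code or the survey data used for this assignment.

```python
from collections import Counter


def percent_distribution(values):
    """Return {category: percent of non-missing observations}.

    Hypothetical helper for illustration; missing values are given as None.
    """
    observed = [v for v in values if v is not None]
    total = len(observed)
    if total == 0:
        return {}
    counts = Counter(observed)
    return {category: 100 * n / total for category, n in counts.items()}


# Toy values invented for this example only (NOT the survey responses).
numsh_toy = ["once", "once", "twice or more", "once", None, "twice or more"]

dist = percent_distribution(numsh_toy)
for category, pct in dist.items():
    print(f"{category}: {pct:.0f}%")
```

Each bar in a univariate graph of a categorical variable corresponds to one entry of this dictionary.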
History of suicide attempt BY time interval groups & number of help-seeking times:
Recoded values for the response variable:
0 = no, didn’t attempt suicide; 1 = yes, attempted suicide
Therefore, the higher the y value, the greater the percentage of people in that category who attempted suicide.
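To see why the 0/1 recode makes the y value readable as a percentage: the mean of a 0/1 variable within a group equals the proportion of that group coded 1. The sketch below shows this in plain Python, with an optional matplotlib bar chart. The names proportion_by_group and plot_proportions and the toy lists are assumptions made for illustration, not the code or data behind the graphs in this post.

```python
from collections import defaultdict


def proportion_by_group(groups, responses):
    """Mean of a 0/1 response within each group, i.e. the share coded 1.

    Hypothetical helper; pairs with a missing (None) value are skipped.
    """
    totals = defaultdict(int)
    ones = defaultdict(int)
    for group, response in zip(groups, responses):
        if group is None or response is None:
            continue
        totals[group] += 1
        ones[group] += response
    return {group: ones[group] / totals[group] for group in totals}


def plot_proportions(props, xlabel, path):
    """Optional: save a bar chart of the proportions (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    fig, ax = plt.subplots()
    ax.bar(list(props.keys()), list(props.values()))
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Proportion who attempted suicide")
    ax.set_ylim(0, 1)
    fig.savefig(path)
    plt.close(fig)


# Toy values invented for this example only (NOT the survey responses).
numsh_toy = ["once", "once", "once", "once", "twice or more", "twice or more"]
attempt_toy = [0, 0, 0, 1, 1, 0]  # 0 = no attempt, 1 = attempted

props = proportion_by_group(numsh_toy, attempt_toy)
print(props)
```

Calling plot_proportions(props, "NUMSH", "numsh_by_attempt.png") would draw one bar per category, with bar height equal to the proportion coded 1 in that category.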
Among people with major depression, suicide attempts appeared to have no association with the time interval between the age of onset of major depression and the age of first help-seeking. However, those who sought help twice or more appeared to have a higher risk of attempting suicide than those who sought help only once.