Here is an overview of the code used to view the distributions of variables in the AddHealth dataset:
-----------begin--------- [[
# import library
import pandas
import numpy
# import .csv file
data = pandas.read_csv('addhealth_pds.csv', low_memory=False)
# set variables to numeric (values that cannot be parsed become NaN)
data['H1WP9'] = pandas.to_numeric(data['H1WP9'], errors='coerce')
data['H1WP10'] = pandas.to_numeric(data['H1WP10'], errors='coerce')
data['H1WP13'] = pandas.to_numeric(data['H1WP13'], errors='coerce')
data['H1WP14'] = pandas.to_numeric(data['H1WP14'], errors='coerce')
data['H1RE1'] = pandas.to_numeric(data['H1RE1'], errors='coerce')
data['H1RE5'] = pandas.to_numeric(data['H1RE5'], errors='coerce')
##### counts & percentages (i.e. frequency distributions) for each variable
# Section 16 : Relations with Parents
print ('###### H1WP9 : How close do you feel to your {MOTHER/ADOPTIVE MOTHER/ STEPMOTHER/ FOSTER MOTHER/etc.}? ')
print ('count :')
c1 = data['H1WP9'].value_counts(sort=False)
print (c1)
print ('percentages :')
p1 = data['H1WP9'].value_counts(sort=False, normalize=True)
print (p1)
print ('###### H1WP13 : How close do you feel to your {FATHER/ADOPTIVE FATHER/STEPFATHER/FOSTER FATHER/etc.}? ')
print ('count :')
c3 = data['H1WP13'].value_counts(sort=False)
print (c3)
print ('percentages :')
p3 = data['H1WP13'].value_counts(sort=False, normalize=True)
print (p3)
# Section 37 : Religion
print ('###### H1RE1 : What is your religion? ')
print ('count :')
c5 = data['H1RE1'].value_counts(sort=False)
print (c5)
print ('percentages :')
p5 = data['H1RE1'].value_counts(sort=False, normalize=True)
print (p5)
]]-----------end------------
Program output (for three variables)
The program produced the following output:
-----------------begin------------[[
###### H1WP9 : How close do you feel to your {MOTHER/ADOPTIVE MOTHER/ STEPMOTHER/ FOSTER MOTHER/etc.}?
count :
4 1229
8 3
1 25
5 4239
2 156
6 2
3 480
7 370
dtype: int64
percentages :
4 0.188961
8 0.000461
1 0.003844
5 0.651753
2 0.023985
6 0.000308
3 0.073801
7 0.056888
dtype: float64
###### H1WP13 : How close do you feel to your {FATHER/ADOPTIVE FATHER/STEPFATHER/FOSTER FATHER/etc.}?
count :
4 1211
8 1
1 75
5 2467
2 184
6 4
3 610
7 1952
dtype: int64
percentages :
4 0.186193
8 0.000154
1 0.011531
5 0.379305
2 0.028290
6 0.000615
3 0.093788
7 0.300123
dtype: float64
###### H1RE1 : What is your religion?
count :
0 751
4 1590
8 95
12 80
16 192
20 1
24 5
28 182
96 25
1 27
5 569
9 8
13 236
17 134
21 27
25 25
2 64
6 9
10 67
14 370
18 17
22 1448
26 54
98 111
3 59
7 25
11 80
15 2
19 216
23 22
27 10
99 3
dtype: int64
percentages :
0 0.115467
4 0.244465
8 0.014606
12 0.012300
16 0.029520
20 0.000154
24 0.000769
28 0.027983
96 0.003844
1 0.004151
5 0.087485
9 0.001230
13 0.036285
17 0.020603
21 0.004151
25 0.003844
2 0.009840
6 0.001384
10 0.010301
14 0.056888
18 0.002614
22 0.222632
26 0.008303
98 0.017066
3 0.009071
7 0.003844
11 0.012300
15 0.000308
19 0.033210
23 0.003383
27 0.001538
99 0.000461
dtype: float64
]]-------------end----------
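The count and percentage steps that the program repeats for each variable could be wrapped in a small helper. The sketch below is only an illustration: the function name `frequency_table` and the toy data are hypothetical and not part of the original program. It also sorts the result by category code, so the categories appear in ascending order rather than in the unordered listing shown in the output.

```python
import pandas

def frequency_table(series):
    """Return counts and percentages for one variable, ordered by category code.

    Hypothetical helper, not part of the original program.
    """
    # Coerce to numeric; values that cannot be parsed become NaN
    numeric = pandas.to_numeric(series, errors='coerce')
    # Same calls as in the program, followed by a sort on the category code
    counts = numeric.value_counts(sort=False).sort_index()
    percentages = numeric.value_counts(sort=False, normalize=True).sort_index()
    return pandas.DataFrame({'count': counts, 'percentage': percentages})

# Toy data standing in for one AddHealth column (assumption, not real survey data)
toy = pandas.Series([5, 5, 4, 7, 5, 3, 4, 5])
table = frequency_table(toy)
print(table)
```

With the real data, the same helper could be called as `frequency_table(data['H1WP9'])`, assuming `data` was loaded as in the program above.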
In the AddHealth survey, 6504 adolescents were asked:
How close do you feel to your {MOTHER/ADOPTIVE MOTHER/ STEPMOTHER/ FOSTER MOTHER/etc.}? [H1WP9]
How close do you feel to your {FATHER/ADOPTIVE FATHER/STEPFATHER/FOSTER FATHER/etc.}? [H1WP13]
What is your religion? [H1RE1]
For the first question, 65.18% of respondents answered "very much" (category 5), and 5.69% fell into category 7 (legitimate skip [no MOM]).
The second question gave a quite different picture: only 37.93% answered "very much", and 30.01% have no father (legitimate skip, category 7).
For the third question, the most common answers are Baptist (24.45%, category 4) and Catholic (22.26%, category 22). We also found that 11.55% fall into category 0 (none [skip to the next section]).
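The percentages quoted for the first question can be re-derived from the counts printed in the program output. The sketch below copies the H1WP9 counts by hand (the variable names are illustrative) and rounds to two decimals, so the last digit can differ from a truncated value.

```python
import pandas

# Counts for H1WP9, copied from the program output (category code -> count)
h1wp9_counts = pandas.Series(
    {1: 25, 2: 156, 3: 480, 4: 1229, 5: 4239, 6: 2, 7: 370, 8: 3}
)

# Total number of respondents
total = h1wp9_counts.sum()

# Share of each category, in percent, rounded to two decimals
percent = (h1wp9_counts / total * 100).round(2)

print(total)           # total respondents
print(percent.loc[5])  # "very much"
print(percent.loc[7])  # legitimate skip (no mother)
```

The total comes to 6504, which matches the number of adolescents stated above.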