MF hearts Data @mfdata - Tumblr Blog

2nd Week Assignment

This week, I experienced the genuine thrill of sitting through my first Python lesson, and the disappointment of not being able to extract “clean” data from a very complex set of numbers. As explained last week, I am interested in examining whether there is a correlation between countries’ per-child spending at the primary school level and female literacy rates. I used two sets of data: primary-school expenditure per student as a percentage of total GDP, collected by the World Bank, and the literacy rate among females aged 15-24, collected by the United Nations Educational, Scientific and Cultural Organization (UNESCO). Unfortunately, I realized too late that I had not picked a data set that lends itself to a frequency distribution analysis as modeled in the videos. I’ll explain my struggles after showing my program (which just designates the two sets of data I’m working with and outputs their size):

Here’s the output:

The number of countries in the per-child spending data (209) is correct, and so is the number of years under examination (1998 to 2007), if you remove one row. The number of countries in the female literacy table is also correct (209), but I ran into the problem of the program recognizing only ONE column for years, when in reality the table covers 2002-2011 (a range I adjusted the data to in order to accommodate scant information and to reflect a four-year interval between investment and return). The problem with this type of data is that it is anything but neat. Typically, this is what the top of the female literacy table looks like:

My objective in this course was to write a Python program that would capture the INCREASE in both spending and literacy between the BEGINNING and the END of a set of years (here it would be 2008 and 2011 for Albania; 2002 and 2006 for Algeria) and compare the rate of increase in literacy with the rate of increase in educational spending. I realize that my Python code needs to be much more sophisticated than I was planning -- it will probably have to use a “first” and “last” function to allow for the different years available per country; it will also need to make sure each row corresponds to the same country in both tables (since the data was collected by two different world bodies: the WB and UNESCO). In short, this first brush with code has been a humbling experience, and I am looking forward to coming up with ways to overcome the “ickiness” of these incomplete, fragmented real-world numbers.
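Here is a rough sketch of how the “first” and “last” idea might work, assuming each table has already been loaded into a dictionary keyed by country name, with one value per year and None where a year has no data. The country names and numbers below are made up for illustration (they are not from the World Bank or UNESCO tables), and first_last_change and compare_countries are hypothetical helper names, not functions from the course.

```python
def first_last_change(values_by_year):
    """Return (first_year, last_year, change) over the years that have data.

    Returns None when fewer than two years are available.
    """
    known = sorted(
        (year, value)
        for year, value in values_by_year.items()
        if value is not None
    )
    if len(known) < 2:
        return None
    (first_year, first_value), (last_year, last_value) = known[0], known[-1]
    return first_year, last_year, last_value - first_value


def compare_countries(spending, literacy):
    """Pair up the first-to-last change for countries present in BOTH tables."""
    results = {}
    # The set intersection keeps only countries that appear in both sources.
    for country in sorted(spending.keys() & literacy.keys()):
        s = first_last_change(spending[country])
        lit = first_last_change(literacy[country])
        if s is not None and lit is not None:
            results[country] = {
                "spending_years": s[:2],
                "spending_change": s[2],
                "literacy_years": lit[:2],
                "literacy_change": lit[2],
            }
    return results


# Made-up illustrative numbers, NOT real World Bank or UNESCO values.
spending = {
    "Country A": {1998: 10.0, 1999: None, 2000: 12.0, 2001: 14.0},
    "Country B": {1998: None, 1999: 8.0, 2000: None, 2001: None},
    "Country C": {1998: 20.0, 2001: 18.0},
}
literacy = {
    "Country A": {2002: None, 2003: 90.0, 2004: None, 2005: 95.0},
    "Country B": {2002: 70.0, 2005: 75.0},
    "Country D": {2002: 50.0, 2005: 60.0},
}

results = compare_countries(spending, literacy)
for country, row in results.items():
    print(country, row)
```

In this toy example only Country A survives: Country B has a single year of spending data, and Countries C and D each appear in only one table. Dividing each change by the number of years between the first and last observation would give a per-year rate that is easier to compare across countries.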

#output:#exercise #data #bad data

Does public spending in primary education result in higher youth female literacy rates across the world?

I am entering Coursera’s data visualization course with the expectation that it will help me understand, sort through and interpret data in areas that are of personal interest to me. For this course, I am choosing two sets of data made available by GapMinder, a Stockholm-based non-profit venture that promotes sustainable development through the use and understanding of statistics. GapMinder collects data from sources including the Institute for Health Metrics and Evaluation, the US Census Bureau’s International Database, the United Nations Statistics Division, and the World Bank. For the class, I will attempt to use two sets of data: primary-school expenditure per student as a percentage of total GDP, collected by the World Bank, and the literacy rate among females aged 15-24, collected by the United Nations Educational, Scientific and Cultural Organization (UNESCO). My main hypothesis is that countries that demonstrate sustained or increased per-child spending at the primary school level will show increases in female literacy rates. While the hypothesis looks self-evident at first glance, I am aware that there are numerous reasons why spending on primary education may not automatically result in higher female literacy levels -- a stagnant or declining GDP may reduce overall education spending, with negative consequences for female literacy; and overall spending levels may not necessarily correlate with increased numbers of female students in schools.
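As a first, very rough check of this hypothesis, one could compute a Pearson correlation between each country’s change in spending and its change in female literacy. The sketch below assumes the two lists are already aligned country by country; the numbers are invented for illustration, and pearson_r is a hypothetical helper written out by hand rather than taken from any library. A strong correlation would still not show that spending causes literacy gains, for the reasons given above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented values: one entry per country, same country order in both lists.
spending_change = [1.0, 2.0, 3.0, 4.0]
literacy_change = [2.0, 4.0, 6.0, 8.0]

r = pearson_r(spending_change, literacy_change)
print(round(r, 3))
```

A value near +1 means the two changes rise together, a value near -1 means they move in opposite directions, and a value near 0 means no linear relationship.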

AVAILABLE RESEARCH: A quick survey of available literature on the subject turned up mostly country-specific studies:

-- http://www.sciencedirect.com/science/article/pii/S0272775701000280 : Determinants of school attainment of boys and girls in Turkey: individual, household and community factors
-- http://papers.ssrn.com/sol3/papers.cfm?abstract_id=650366 : What Factors Influence World Literacy? Is Africa Different?
-- http://www.tandfonline.com/doi/abs/10.1080/10670560124330 : Inequality in Chinese Education
-- http://www.tandfonline.com/doi/abs/10.1080/13552070512331332273 : Gender equality and women's empowerment: A critical analysis of the third millennium development goal 1

Summary: The studies use a vast array of indicators to evaluate educational attainment. Study 4 uses variables such as per capita GDP, enrollment rate, pupils per teacher, education expenditure as a percentage of GDP, and years of schooling. In Study 3, the authors try to identify the particular government policies that are responsible for existing discrepancies in educational achievement. Study 2 attempts to shed light on the causes of the significantly lower level of academic attainment for girls, positing that understanding the constraints behind the large gender gap in Turkish education may be useful to policy makers and planners.
