



id gender ses schtyp prog  write 
70 male low public general  52 
121 female middle public vocation 68 
86 male high public general  33 
141 male high public vocation 63  
172 male middle public academic 47 
113 male middle public academic 44 
50 male middle public general  59 
11 male middle public academic 34  
84 male middle public general  57  
48 male middle public academic 57  
75 male middle public vocation 60  
60 male middle public academic 57 


import csv
import numpy

# Read the file, skip the header row, and load the remaining rows into a list
data = []
with open('scores.csv', newline='') as f:
    csv_file_object = csv.reader(f)
    header = next(csv_file_object)
    for row in csv_file_object:
        data.append(row)

# Ask for inputs
gender = input('Enter gender [male/female]: ')
schtyp = input('Enter school type [public/private]: ')
ses = input('Enter socioeconomic status [low/middle/high]: ')
prog = input('Enter program status [general/vocation/academic]: ')

# Make them lower-case strings
gender, schtyp, ses, prog = (str(s).strip().lower() for s in (gender, schtyp, ses, prog))

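What I'm missing is how to filter so that I get statistics only for one specific group. For example, suppose I enter male, public, middle and academic: I want the average writing score for that subset. I tried the groupby function from pandas, but that only seems to give statistics for broad groups (e.g. public vs. private). I also tried a pandas DataFrame, but I could only filter on one input, and I'm not sure how to get the writing score from the result. Any tips would be appreciated!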


# Subsetting by using True/False: 
subset = itu['CntryName'] == 'Albania' # returns True/False values 
itu[subset] # returns 1x144 DataFrame of only data for Albania 
itu[itu['CntryName'] == 'Albania'] # one-line command, equivalent to the above two lines 

# Pandas has many built-in functions like .isin() to provide params to filter on  
itu[itu.cntrycode.isin(['USA','FRA'])] # returns where itu['cntrycode'] is 'USA' or 'FRA' 
itu[itu.year.isin([2000,2001,2002])] # Returns all of itu for only years 2000-2002 
# Advanced subsetting can include logical operations: 
itu[itu.cntrycode.isin(['USA','FRA']) & itu.year.isin([2000,2001,2002])] # Both of above at same time 

# Use .loc with two elements to simultaneously select by row/index & column: 
itu.loc[['USA','BHS'], ['CntryName', 'Year']] 
itu.iloc[[204, 13], [0, 1]] 

# Can do many operations at once, but this reduces "readability" of the code 
itu[itu.cntrycode.isin(['USA','FRA']) & 
    itu.year.isin([2000,2001,2002])].loc[:, ['cntrycode','cntryname','year','mpen','fpen']] 

# Finally, if you're comfortable with using map() and list comprehensions,
# you can do some advanced subsetting that includes evaluations & functions
# to determine what elements you want to select from the whole, such as all
# countries whose name begins with "United":
criterion = itu['CntryName'].map(lambda x: x.startswith('United')) 
itu[criterion]['CntryName'] # gives us UAE, UK, & US 
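
Applied to the data in the question, the same True/False subsetting might look like the sketch below. The sample rows are copied from the question and inlined so the snippet runs on its own; the names `SAMPLE`, `scores`, `mask`, `subset` and `mean_write` are just illustrative, and in practice the four values would come from the user's input rather than being hard-coded.

```python
import io
import pandas as pd

# Sample rows from the question, inlined so this runs without a file on disk.
SAMPLE = """id gender ses schtyp prog write
70 male low public general 52
121 female middle public vocation 68
86 male high public general 33
141 male high public vocation 63
172 male middle public academic 47
113 male middle public academic 44
50 male middle public general 59
11 male middle public academic 34
84 male middle public general 57
48 male middle public academic 57
75 male middle public vocation 60
60 male middle public academic 57
"""
scores = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

# One True/False mask per condition, combined with &.
# Each comparison needs its own parentheses because & binds tighter than ==.
mask = (
    (scores["gender"] == "male")
    & (scores["schtyp"] == "public")
    & (scores["ses"] == "middle")
    & (scores["prog"] == "academic")
)

subset = scores[mask]                # only the rows matching all four inputs
mean_write = subset["write"].mean()  # average writing score for that subset
print(len(subset), mean_write)
```

Selecting the `write` column after filtering is what gives the score for the subset; any other statistic (`.median()`, `.std()`, `.describe()`) can be swapped in for `.mean()`.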

import pandas as pd 
data = pd.read_csv('fileName.txt', sep=r'\s+')  # whitespace-delimited; delim_whitespace=True is deprecated in recent pandas

#get all of the male students 
data[data['gender'] == 'male'] 
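
Since the question mentions groupby: it is not limited to one broad grouping, because it accepts a list of columns. A rough sketch, assuming the column names from the question and using a few of its rows inline (the names `rows`, `group_means`, `answers`, `key` and `result` are made up for illustration):

```python
import io
import pandas as pd

# A few rows from the question's data, inlined so the sketch is self-contained.
rows = """id gender ses schtyp prog write
70 male low public general 52
121 female middle public vocation 68
172 male middle public academic 47
113 male middle public academic 44
50 male middle public general 59
11 male middle public academic 34
"""
data = pd.read_csv(io.StringIO(rows), sep=r"\s+")

# Group on all four columns at once: one mean per combination present in the data.
group_means = data.groupby(["gender", "schtyp", "ses", "prog"])["write"].mean()

# Hypothetical user answers; in the real script these would come from input().
answers = {"gender": "male", "schtyp": "public", "ses": "middle", "prog": "academic"}

# Build the lookup key in the same order as the grouping columns.
key = tuple(answers[name] for name in group_means.index.names)
result = group_means.loc[key]  # raises KeyError if no row has that combination
print(result)
```

This computes the statistic for every combination up front, which is handy if you want to look up several groups; for a single lookup, filtering with combined conditions is probably simpler.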

