I'm trying to create a function in Python that uses dataframes from two other CSV files to determine population.
My Python file looks like this:
import numpy as np
import pandas as pd

# NB "index_col=False" is necessary because otherwise pandas tries to use the first column as the index,
# and throws everything out by one column
df = pd.read_csv("mm_copy.csv", index_col=False)

countries_df = pd.read_csv("countries.csv")
south_america_df = countries_df.loc[countries_df.Continent == "South America"]
south_american_countries_list = south_america_df.Entity.tolist()  # Creates complete list of South American countries, using local definitions
print(south_american_countries_list)

population_df = pd.read_csv("population.csv", low_memory=False)
population_df.set_index("Location")

def calculate_crude_maternal_mortality(year):
    # Establishes total WHO female population of South America
    who_female_pop_slice = population_df.loc[(population_df.Location == "South America") & (population_df.Time == year)]
    who_female_pop = who_female_pop_slice["TPopulationFemale1July"]

    # Calc total South American female population in relevant year
    total_south_american_female_pop = []
    for country in south_american_countries_list:  # iterates through list of South American countries
        if population_df["Location"].str.contains(country).any():  # checks country is in population dataframe
            population_df_slice = population_df.loc[(population_df.Location == country) & (population_df.Time == year)]  # locates appropriate row which has country and year
            country_female_pop = population_df_slice["TPopulationFemale1July"]  # finds cell with female population
            total_south_american_female_pop.append(country_female_pop)  # adds this population to the totals list
    print(total_south_american_female_pop)
    # To add - adding the values up within the list
    print("Total South American female population, using WHO definitions = " + str(who_female_pop))

calculate_crude_maternal_mortality(2015)
The countries_df section is fine, and does indeed create a list of the required countries. I created it this way so that I can iterate over it country by country.
An excerpt from the population.csv file (and therefore dataframe) looks like this:
SortOrder,LocID,Notes,ISO3_code,ISO2_code,SDMX_code,LocTypeID,LocTypeName,ParentID,Location,VarID,Variant,Time,TPopulation1Jan,TPopulation1July,TPopulationMale1July,TPopulationFemale1July,PopDensity,PopSexRatio,MedianAgePop,NatChange,NatChangeRT,PopChange,PopGrowthRate,DoublingTime,Births,Births1519,CBR,TFR,NRR,MAC,SRB,Deaths,DeathsMale,DeathsFemale,CDR,LEx,LExMale,LExFemale,LE15,LE15Male,LE15Female,LE65,LE65Male,LE65Female,LE80,LE80Male,LE80Female,InfantDeaths,IMR,LBsurvivingAge1,Under5Deaths,Q5,Q0040,Q0040Male,Q0040Female,Q0060,Q0060Male,Q0060Female,Q1550,Q1550Male,Q1550Female,Q1560,Q1560Male,Q1560Female,NetMigrations,CNMR
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2012,45550.12,45782.417,22632.42,23149.997,40.743,97.7642,27.2575,520.504,11.362,464.594,1.015,68.2904,748.734,163.722,16.344,1.9297,0.92,26.376,104.5,228.23,130.084,98.146,4.982,75.5973,72.4547,78.7871,62.1301,59.0969,65.1879,17.4558,16.0588,18.7051,7.1826,6.3333,7.824,10.868,14.5137,739.171,12.888,17.1791,61.6454,86.0621,36.5196,133.7882,174.8247,92.1336,66.6632,96.237,36.622,115.7444,155.5635,75.5928,-55.913,-1.221
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2013,46014.714,46237.93,22850.386,23387.544,41.1484,97.7032,27.6756,511.117,11.047,446.432,0.966,71.7544,744.381,159.336,16.088,1.9049,0.9087,26.405,104.5,233.264,132.443,100.821,5.041,75.8267,72.7253,78.9723,62.3146,59.3192,65.3327,17.5411,16.1093,18.8282,7.2328,6.3674,7.8916,10.44,14.0247,735.204,12.359,16.581,59.6473,82.8304,35.7795,130.9528,170.3958,90.9944,64.6886,92.7506,36.1904,113.4276,151.6658,74.9431,-64.686,-1.398
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2014,46461.146,46677.947,23060.58,23617.367,41.54,97.6425,28.0974,501.117,10.728,433.602,0.929,74.6122,739.615,152.064,15.834,1.883,0.8986,26.509,104.5,238.498,134.812,103.686,5.106,76.0427,72.9851,79.1389,62.4916,59.5363,65.4649,17.637,16.185,18.9449,7.285,6.4182,7.9467,10.055,13.5942,730.785,11.893,16.0617,57.8828,79.9829,35.12,128.4596,166.4828,90.0037,62.9794,89.7482,35.799,111.3895,148.2313,74.3682,-67.516,-1.445
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2015,46894.748,47119.728,23272.374,23847.355,41.9331,97.5889,28.5193,491.031,10.417,449.96,0.955,72.5809,734.664,144.665,15.585,1.8627,0.8893,26.64,104.5,243.633,136.964,106.669,5.168,76.2573,73.2516,79.2916,62.6672,59.7603,65.5831,17.7591,16.3109,19.0601,7.3437,6.4971,7.9874,9.675,13.1695,726.176,11.428,15.5387,56.3993,77.6037,34.5513,126.3584,163.1564,89.1868,61.7138,87.4564,35.5774,109.7502,145.3989,73.9664,-41.064,-0.871
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2016,47344.708,47625.955,23517.966,24107.989,42.3836,97.5526,28.9314,482.508,10.14,562.495,1.181,58.6915,730.565,140.179,15.353,1.8435,0.8806,26.686,104.5,248.057,138.68,109.378,5.213,76.4713,73.5187,79.4398,62.8419,59.9837,65.6973,17.9043,16.4802,19.1741,7.4081,6.5985,8.0166,9.316,12.7494,722.402,10.99,15.0258,55.193,75.7559,33.9994,124.5565,160.4191,88.3461,60.8743,85.9499,35.4094,108.4201,143.1839,73.5314,79.978,1.681
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2017,47907.203,48351.67,23874.571,24477.1,43.0295,97.5384,29.3149,473.319,9.832,888.935,1.839,37.6915,726.008,135.564,15.081,1.8178,0.8684,26.75,104.5,252.689,140.463,112.226,5.249,76.6456,73.7592,79.5348,62.9901,60.1914,65.7739,18.0382,16.6428,19.2726,7.469,6.6958,8.0421,8.998,12.3818,718.136,10.637,14.615,54.3985,74.2385,33.9488,123.1977,157.9442,88.1227,60.3837,84.7024,35.6815,107.3689,141.0915,73.5224,415.618,8.633
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2018,48796.138,49276.961,24331.054,24945.907,43.8529,97.5353,29.6736,467.285,9.53,961.647,1.952,35.5096,727.649,131.012,14.841,1.7868,0.8539,26.823,104.5,260.364,144.941,115.423,5.311,76.7479,73.8421,79.6597,63.05,60.2293,65.8597,18.1368,16.7471,19.3652,7.5196,6.7639,8.0783,8.716,11.9679,720.031,10.276,14.0874,54.3932,74.565,33.586,122.77,157.7746,87.356,61.1154,85.8714,35.9247,107.4454,141.4585,73.2214,494.364,10.083
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2019,49757.785,50187.406,24779.301,25408.106,44.6632,97.5252,30.0377,463.436,9.271,859.243,1.712,40.4876,733.94,126.781,14.682,1.765,0.8434,26.872,104.6,270.504,151.165,119.339,5.411,76.7523,73.7998,79.721,63.0269,60.1591,65.8941,18.1682,16.7473,19.431,7.5553,6.7804,8.1323,8.511,11.5935,726.503,10.041,13.6627,55.0823,75.6392,33.8578,123.0687,158.3005,87.3372,62.4333,87.5412,36.8336,108.0507,142.2845,73.5153,395.803,7.918
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2020,50617.028,50930.662,25139.588,25791.075,45.3246,97.474,30.4097,397.835,7.829,627.269,1.232,56.2619,733.491,121.663,14.435,1.7369,0.8305,26.912,104.6,335.656,191.26,144.396,6.606,74.7692,71.5365,78.1362,60.9525,57.7989,64.2239,16.6634,14.9753,18.2591,7.0308,6.229,7.6415,8.237,11.2177,726.305,9.697,13.1843,55.0287,74.9347,34.4943,140.9395,181.2286,100.096,68.1517,94.3241,41.5122,126.9074,166.2696,87.2104,229.437,4.515
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2021,51244.297,51516.562,25415.242,26101.321,45.846,97.3715,30.7725,332.554,6.469,544.53,1.057,65.5768,730.203,118.645,14.204,1.7168,0.8197,26.935,104.6,397.649,226.497,171.151,7.735,72.8296,69.4041,76.4421,58.9838,55.6203,62.5192,15.9966,14.4072,17.485,6.9963,6.2707,7.536,8,10.943,723.229,9.471,12.9233,63.5047,87.68,38.4849,175.774,226.3475,123.8155,87.1411,121.5977,51.8059,162.2411,212.2602,111.0627,211.978,4.123
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2022,51788.827,51874.024,25575.607,26298.417,46.1641,97.2515,31.171,338.317,6.512,170.394,0.328,,723.264,113.984,13.921,1.6922,0.8083,26.978,104.5,384.947,217.744,167.203,7.409,73.6595,70.3187,77.1432,59.8219,56.5496,63.2215,16.3778,14.8191,17.8162,6.9677,6.2137,7.5318,7.888,10.886,716.393,9.371,12.8868,61.5887,85.2531,37.0922,162.6828,209.9033,114.286,81.2504,113.7146,47.952,149.0061,195.5519,101.5067,-167.924,-3.232
247,170,,COL,CO,170,4,Country/Area,931,Colombia,2,Medium,2023,51959.221,52085.168,25668.667,26416.5,46.352,97.1691,31.6447,426.943,8.184,251.893,0.484,,714.422,109.523,13.694,1.6886,0.8092,27.014,104.5,287.479,159.288,128.191,5.51,77.5138,74.6074,80.4018,63.6629,60.8467,66.4437,18.4607,17.0113,19.7221,7.7129,6.8891,8.3167,7.431,10.3908,707.951,8.822,12.29,49.7935,68.8941,30.0663,114.364,147.9818,80.2788,57.0775,80.4526,33.2056,100.8394,133.3786,68.0231,-175.051,-3.355
But when I run this, the output isn't what I'm expecting. I expected just a list of the relevant female population values (each drawn from a single "cell" in the CSV/dataframe), but every entry also includes the row number (which I think is the default index) and the datatype. Running the Python file outputs this:
['Argentina', 'Bolivia', 'Brazil', 'Chile', 'Colombia', 'Ecuador', 'Falkland Islands', 'French Guiana', 'Guyana', 'Paraguay', 'Peru', 'South Georgia and the South Sandwich Islands', 'Suriname', 'Uruguay', 'Venezuela']
[36241    21854.745
Name: TPopulationFemale1July, dtype: float64, 36393    5511.477
Name: TPopulationFemale1July, dtype: float64, 36545    104179.515
Name: TPopulationFemale1July, dtype: float64, 36697    9004.176
Name: TPopulationFemale1July, dtype: float64, 36849    23847.355
Name: TPopulationFemale1July, dtype: float64, 37001    8095.674
Name: TPopulationFemale1July, dtype: float64, 37153    1.71
Name: TPopulationFemale1July, dtype: float64, 37305    129.182
Name: TPopulationFemale1July, dtype: float64, 37457    383.41
Name: TPopulationFemale1July, dtype: float64, 37609    3071.251
Name: TPopulationFemale1July, dtype: float64, 37761    15496.198
Name: TPopulationFemale1July, dtype: float64, 37913    287.482
Name: TPopulationFemale1July, dtype: float64, 38065    1759.187
Name: TPopulationFemale1July, dtype: float64, 38217    15341.128
Name: TPopulationFemale1July, dtype: float64]
Total South American female population, using WHO definitions = 36089    208962.491
Name: TPopulationFemale1July, dtype: float64
I was hoping for just a list of numbers (the number at the end of each row here) so that I could add them up. That is, the numbers are correct, but I can't add them up because of all the extraneous information attached to each one.
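For reference, the behaviour can be reproduced on a tiny in-memory frame. The sketch below is only an illustration under stated assumptions: the Colombia figures are taken from the excerpt above, while "Examplia", "Nowhereland" and the helper name female_population_by_country are made up for the example. It suggests that selecting a column from a filtered dataframe yields a one-row Series (hence the index and dtype in the printout), and that .iloc[0] is one way to pull out the bare number.

```python
import pandas as pd

# Toy stand-in for population.csv. Only the Colombia figures come from the
# excerpt above; the "Examplia" rows are invented for illustration.
population_df = pd.DataFrame(
    {
        "Location": ["Colombia", "Colombia", "Examplia", "Examplia"],
        "Time": [2014, 2015, 2014, 2015],
        "TPopulationFemale1July": [23617.367, 23847.355, 100.0, 110.5],
    }
)
countries = ["Colombia", "Examplia", "Nowhereland"]  # last one has no rows


def female_population_by_country(df, countries, year):
    """Return a list of plain floats, one per country that has a row for `year`."""
    values = []
    for country in countries:
        rows = df.loc[(df["Location"] == country) & (df["Time"] == year)]
        if rows.empty:
            continue  # country/year combination not present in the data
        # rows["TPopulationFemale1July"] is a one-row Series (index + dtype);
        # .iloc[0] extracts the single value it holds.
        values.append(float(rows["TPopulationFemale1July"].iloc[0]))
    return values


values_2015 = female_population_by_country(population_df, countries, 2015)
total_2015 = sum(values_2015)

# Possible alternative without an explicit loop: filter once, then sum the column.
mask = population_df["Location"].isin(countries) & (population_df["Time"] == 2015)
total_2015_vectorised = population_df.loc[mask, "TPopulationFemale1July"].sum()

print(values_2015)
print(total_2015)
```

The same idea would presumably apply to who_female_pop in the function above, since it is also a one-row Series rather than a number.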
Can anyone offer any suggestions? Thanks