Agroalimentary (2 points)

From the Catalan Government web we have downloaded the file agroalimentary.csv related to Catalan agroalimentary companies, that you already know. We have extended it with a randomly generated column named Facturacio showing the company billing in thousands of euros (int). Download and study this file. We have already created a DataFrame with this data.

The following example shows the names of all the columns of this DataFrame, then a subdataframe example with columns corresponding to the company name (Establiment), the region (Comarca), the company type (Titularitat) and the billing (Facturacio), and, below, the same subdataframe with columns Establiment and Sector:

>>> list(dfagro.columns)
['Establiment', 'Adreça', 'Codi postal', 'Municipi', 'Comarca', 'Titularitat', 'Sector', 'Latitud', 'Longitud', 'Facturacio']

>>> dfagro[['Establiment', 'Comarca', 'Titularitat', 'Facturacio']][15:25]
                Establiment            Comarca        Titularitat  Facturacio
15        AVICOLA SURIA, SA              Bages   SOCIETAT ANÒNIMA        1396
16     BALAGUER I CABRÉ, SL            Priorat  SOCIETAT LIMITADA        5496
17  BIO OLEICS BELIANES, SL             Urgell  SOCIETAT LIMITADA        4194
18     BODEGAS J. TRIAS, SA        Alt Penedès   SOCIETAT ANÒNIMA        3183
19   BODEGAS VILA CORONA SL      Pallars Jussà  SOCIETAT LIMITADA         644
20   BODEGUES VISENDRA, SCP           Alt Camp             ALTRES         704
21          CA L'ARENYS, SL              Bages  SOCIETAT LIMITADA        4115
22           CAL GASSET, SL   Conca de Barberà  SOCIETAT LIMITADA        6187
23     CANAL SELMA, JOAQUIM           Ripollès         PARTICULAR        5710
24       CARNE GUAL, ESTEVE  Vallès Occidental         PARTICULAR        1964

>>> dfagro[['Establiment', 'Sector']][15:25]
                Establiment                                             Sector
15        AVICOLA SURIA, SA   10.8 - Fabricació d'altres productes alimentaris
16     BALAGUER I CABRÉ, SL                       11.0 - Fabricació de begudes
17  BIO OLEICS BELIANES, SL  10.4 - Fabricació d'olis i greixos vegetals i ...
18     BODEGAS J. TRIAS, SA                       11.0 - Fabricació de begudes
19   BODEGAS VILA CORONA SL                       11.0 - Fabricació de begudes
20   BODEGUES VISENDRA, SCP                       11.0 - Fabricació de begudes
21          CA L'ARENYS, SL                       11.0 - Fabricació de begudes
22           CAL GASSET, SL                       11.0 - Fabricació de begudes
23     CANAL SELMA, JOAQUIM              10.5 - Fabricació de productes lactis
24       CARNE GUAL, ESTEVE                         16. - Indústries forestals

Save the two following functions to file agro.py.

  1. Write function agro1() that takes a DataFrame as that described, a sector (str) and a company type (str). This function returns a tuple with a DataFrame and an integer (int). The returned DataFrame is a subdataframe of the given one, with only those rows that correspond to the given sector and company type and the two columns 'Establiment' and 'Facturacio'. This subdataframe is sorted by the column 'Facturacio' in descending order. The returned integer is the total billing (column 'Facturacio') of the computed subdataframe (1 point). Examples:

    >>> df1, total1 = agro1(dfagro, '16. - Indústries forestals', 'SOCIETAT LIMITADA')
    >>> total1
    80128
    >>> type(df1)
    <class 'pandas.core.frame.DataFrame'>
    >>> len(df1)
    16
    >>> df1.head()
                     Establiment  Facturacio
    581  SERRERIAS FONTBONA,S.L.        9149
    137      PALETS SANFELIU, SL        8241
    219         BOSCAT FUSTA, SL        8227
    658     FUSTES TORRUELLA, SL        7917
    13               ARTICOR, SL        7818
    >>> list(df1.iloc[10])
    ['SERRADORA BELLVER, SL', 3267]
    

    Note

    You have more tests in file test-agro1.txt

    Hint

    Use selection, methods sort_values and sum.

  2. Write function agro2() that takes a DataFrame, as that described and two latitude values, \(la1\) and \(la2\) (float), such that \(la1<la2\). This function considers the subset of companies located at a latitude \(la\) such that \(la1 \leq la \leq la2\) and returns a Series in which the index is the sector (str) and the value is the billing mean of the companies of this sector (float), for the given latitudes inteval (1 point). Examples:

    >>> s1 = agro2(dfagro, 41.2, 41.4)
    >>> type(s1)
    <class 'pandas.core.series.Series'>
    >>> round(s1['10.1 - Sacrifici de bestiar i conservació de carn i elaboració de productes carnis'], 2)
    5360.05
    >>> round(s1['10.5 - Fabricació de productes lactis'], 2)
    3432.5
    >>> round(s1['11.0 - Fabricació de begudes'], 2)
    4742.17
    >>> round(s1['16. - Indústries forestals'], 2)
    2594.33
    

    Note

    You have more tests in file test-agro2.txt

    Hint

    Use selection and methods groupby and mean.

Note

To run your functions you don’t actually need to read the given data file. But you need to download and save it into your working folder because the given testfiles read from it to create the DataFrame.