Imagine that we want to determine whether unemployment was high (> 6.5), medium (4.5 < x <= 6.5), or low (<= 4.5) for each state and each month.

1. Write a Python function that takes a single number as an input and outputs a single string noting if that number is high, medium, or low.
2. Pass your function to `applymap` (quiz: why `applymap` and not `agg` or `apply`?) and save the result in a new DataFrame called `unemp_bins`.
3. (Challenging) This exercise has multiple parts:
   1. Use another transform on `unemp_bins` to count how many times each state had each of the three classifications.
      - Hint 1: Will this value counting function be a Series or scalar transform?
      - Hint 2: Try googling "pandas count unique value" or something similar to find the right transform.
   2. Construct a horizontal bar chart of the number of occurrences of each level with one bar per state and classification (21 total bars).
4. (Challenging) Repeat the previous step, but count how many states had each classification in each month. Which month had the most states with high unemployment? What about medium and low?

Posted: 2024-02-14 20:08:03 Views: 107

Sure, I can help you with that. Throughout, I assume `df` is the unemployment DataFrame with one row per month and one column per state.

Here is the Python function that takes a single number as an input and outputs a single string noting if that number is high, medium, or low:

```python
def classify_unemployment(x):
    # Thresholds from the exercise: high > 6.5, low <= 4.5, medium in between
    if x > 6.5:
        return 'high'
    elif x <= 4.5:
        return 'low'
    else:
        return 'medium'
```

We use `applymap()` instead of `agg()` or `apply()` because the function is a scalar transform: it must be called on each element of the DataFrame. `apply()` would pass a whole column (a Series) to the function, where the comparison `x > 6.5` inside an `if` raises a `ValueError`, and `agg()` is meant for functions that reduce a Series to a single value. Here's how you can use this function with `applymap()` to create a new DataFrame called `unemp_bins`:

```python
unemp_bins = df.applymap(classify_unemployment)
```

In pandas 2.1 and later, `applymap()` is deprecated in favor of `DataFrame.map()`, which works the same way.

To count how many times each state had each of the three classifications, we can use `value_counts()` on each column of `unemp_bins`. This answers Hint 1: `value_counts()` takes a Series and returns a Series, so it is a Series transform, and we pass it to `apply()` to run it on each column:

```python
import pandas as pd

# Rows: classification, columns: state; a missing combination becomes 0
state_counts = unemp_bins.apply(pd.Series.value_counts).fillna(0)
```

To construct a horizontal bar chart of the number of occurrences of each level, we can use the `plot()` method on the `state_counts` DataFrame with `kind='barh'`. Each classification on the vertical axis gets one bar per state, which gives 21 bars for 7 states:

```python
state_counts.plot(kind='barh', title='Unemployment Classification by State')
```

To count how many states had each classification in each month, we apply the same transform across rows instead of columns by passing `axis=1`. The result has one row per month and one column per classification:

```python
month_counts = unemp_bins.apply(pd.Series.value_counts, axis=1).fillna(0)
```

To find out which month had the most states with high, medium, and low unemployment, we can use `idxmax()` on each column of `month_counts`. It returns the index label (the month) of the maximum value in that column, or the first such month if several are tied:

```python
high_month = month_counts['high'].idxmax()
medium_month = month_counts['medium'].idxmax()
low_month = month_counts['low'].idxmax()
```

This will give you the month with the most states with high, medium, and low unemployment.
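As a self-contained check of the whole exercise, here is a minimal sketch on a made-up table. The three states, four months, and all rates are hypothetical illustration data, not the real unemployment figures. It assumes the layout of one row per month and one column per state, and it counts per month by applying `value_counts` across rows with `axis=1`.

```python
import pandas as pd

# Hypothetical toy data: rows are months, columns are states (values are made up).
df = pd.DataFrame(
    {
        "Arizona": [4.0, 5.0, 7.0, 7.5],
        "California": [4.4, 5.5, 6.8, 7.0],
        "Texas": [4.5, 4.2, 6.5, 7.0],
    },
    index=pd.to_datetime(["2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01"]),
)


def classify_unemployment(x):
    """Scalar transform: map one unemployment rate to 'high', 'medium', or 'low'."""
    if x > 6.5:
        return "high"
    elif x <= 4.5:
        return "low"
    return "medium"


# Element-wise transform. DataFrame.map replaces applymap from pandas 2.1 on,
# so fall back to applymap only when map is not available.
elementwise = df.map if hasattr(df, "map") else df.applymap
unemp_bins = elementwise(classify_unemployment)

# Series transform applied per column: classification counts for each state.
state_counts = unemp_bins.apply(pd.Series.value_counts).fillna(0).astype(int)

# Same transform applied per row: number of states in each class for each month.
month_counts = unemp_bins.apply(pd.Series.value_counts, axis=1).fillna(0).astype(int)

# Month with the most states in each class (the first such month if tied).
peak_months = month_counts.idxmax()

print(state_counts)
print(month_counts)
print(peak_months)
```

Because `idxmax()` reports only the first month when several are tied, it may be worth comparing `month_counts` against its column maximum when ties matter.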
