# Selecting rows and columns

## Learning Objectives

After working through this topic, you should be able to:

- Select rows and individual cells via `.loc`
- Select single and multiple columns
- Explain why label-based selection is better than position-based selection
- Select rows based on Boolean Series
- Select rows via queries
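
As a rough orientation, the objectives above can be sketched on a small example. The DataFrame below, including its row labels, column names, and values, is made up for illustration and is not taken from the video or slides.

```python
import pandas as pd

# Hypothetical example data; labels and values are illustrative only.
df = pd.DataFrame(
    {"height": [1.80, 1.65, 1.72], "age": [34, 28, 45]},
    index=["alice", "bob", "carol"],
)

# Rows and individual cells via .loc (label-based).
row = df.loc["bob"]             # Series holding the row labelled "bob"
cell = df.loc["bob", "age"]     # single value from row "bob", column "age"

# Single and multiple columns.
ages = df["age"]                # single brackets: Series
subset = df[["age", "height"]]  # list of labels: DataFrame

# Rows based on a Boolean Series.
is_older = df["age"] > 30       # Boolean Series sharing df's index
older = df[is_older]            # keeps the rows where the value is True

# Rows via a query string.
older_q = df.query("age > 30")  # same rows as the Boolean selection

# Labels survive reordering, positions do not.
shuffled = df.sort_values("age")
by_label = shuffled.loc["bob", "age"]  # still bob's age
by_position = shuffled.iloc[1]         # no longer bob's row after sorting
```

The last three lines hint at why label-based selection tends to be more robust: once the rows are reordered, position 1 refers to a different person, while the label `"bob"` still finds the same row.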

## Materials

Video:

<iframe
  src="https://electure.uni-bonn.de/paella7/ui/watch.html?id=734a4d6b-f641-46e4-aacd-393a9594d39f"
  width="640"
  height="360"
  frameborder="0"
  allowfullscreen
></iframe>

Download the [slides](pandas-selection.pdf).

## Quiz

In [None]:
content = [
    {
        "question": "What will the output of this code snippet be?",
        "code": """>>> df
   a  b
x  0  1
y  2  3

>>> df['a']
""",
        "type": "multiple_choice",
        "answers": [
            {
                "answer": "(pd.Series)",
                "code": """x    0
y    2
Name: a, dtype: int64
""",
                "correct": True,
                "feedback": "",
            },
            {
                "answer": "(DataFrame)",
                "code": """   a
x  0
y  2
""",
                "correct": False,
                "feedback": "Almost, but single brackets yield a Series.",
            },
            {
                "code": """KeyError: "None of
[Index(['a'], dtype='string')]
are in the [index]"
""",
                "correct": False,
                "feedback": (
                    "Plain brackets select columns, not rows, and 'a' is a column."
                ),
            },
            {
                "code": "KeyError: 'a'",
                "correct": False,
                "feedback": (
                    "Plain brackets select columns, and 'a' is a column, so no "
                    "error is raised."
                ),
            },
        ],
    },
    {
        "question": "What will the output of this code snippet be?",
        "code": """>>> df
   a  b
x  0  1
y  2  3

>>> df.loc['b']
""",
        "type": "multiple_choice",
        "answers": [
            {
                "answer": "(pd.Series)",
                "code": """x    1
y    3
Name: b, dtype: int64
""",
                "correct": False,
                "feedback": "The `.loc` indexer yields rows, not columns.",
            },
            {
                "answer": "(DataFrame)",
                "code": """   b
x  1
y  3
""",
                "correct": False,
                "feedback": "The `.loc` indexer yields rows, not columns.",
            },
            {
                "code": """KeyError: "None of
[Index(['b'], dtype='string')]
are in the [index]"
""",
                "correct": False,
                "feedback": (
                    "Almost, but this is the error message when passing a list."
                ),
            },
            {
                "code": "KeyError: 'b'",
                "correct": True,
                "feedback": "Indeed, 'b' is not among the rows.",
            },
        ],
    },
    {
        "question": "What will the output of this code snippet be?",
        "code": """>>> df
   a  b
x  0  1
y  2  3

>>> df.loc[['y']]
""",
        "type": "multiple_choice",
        "answers": [
            {
                "answer": "(pd.Series)",
                "code": """a    2
b    3
Name: y, dtype: int64
""",
                "correct": False,
                "feedback": "Almost, but passing a list of labels yields a DataFrame.",
            },
            {
                "answer": "(DataFrame)",
                "code": """   a  b
y  2  3
""",
                "correct": True,
                "feedback": "",
            },
            {
                "code": """KeyError: "None of
[Index(['y'], dtype='string')]
are in the [columns]"
""",
                "correct": False,
                "feedback": "The `.loc` indexer yields rows, not columns.",
            },
            {
                "code": "KeyError: 'y'",
                "correct": False,
                "feedback": "The `.loc` indexer yields rows, not columns.",
            },
        ],
    },
    {
        "question": "What will the output of this code snippet be?",
        "code": """>>> df
   a  b
x  0  1
y  2  3

>>> df[['x']]
""",
        "type": "multiple_choice",
        "answers": [
            {
                "answer": "(pd.Series)",
                "code": """a    0
b    1
Name: x, dtype: int64
""",
                "correct": False,
                "feedback": "Indexing into a DataFrame yields columns, not rows.",
            },
            {
                "answer": "(DataFrame)",
                "code": """   a  b
x  0  1
""",
                "correct": False,
                "feedback": "Indexing into a DataFrame yields columns, not rows.",
            },
            {
                "code": """KeyError: "None of
[Index(['x'], dtype='string')]
are in the [columns]"
""",
                "correct": True,
                "feedback": "The columns are labelled 'a' and 'b'.",
            },
            {
                "code": "KeyError: 'x'",
                "correct": False,
                "feedback": (
                    "Almost, but this is the error message when indexing with a "
                    "nonexistent column name, not when passing a list of potential "
                    "column names."
                ),
            },
        ],
    },
]

from jupyterquiz import display_quiz

display_quiz(content, colors="fdsp")
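
To check the quiz answers yourself, the DataFrame from the questions can be rebuilt and each expression run directly. This is a minimal sketch: the constructor call is one assumed way of producing the `df` shown in the quiz, and `raises_key_error` is a hypothetical helper written for this example, not part of pandas. The exact wording of the error messages may differ between pandas versions, so the sketch only checks that a `KeyError` occurs.

```python
import pandas as pd

# One possible way to build the DataFrame displayed in the quiz.
df = pd.DataFrame({"a": [0, 2], "b": [1, 3]}, index=["x", "y"])

# Question 1: single brackets select one column as a Series.
col = df["a"]

# Question 3: .loc with a list of labels selects rows as a DataFrame.
row_df = df.loc[["y"]]


def raises_key_error(select):
    """Hypothetical helper: report whether calling `select` raises KeyError."""
    try:
        select()
    except KeyError:
        return True
    return False


# Question 2: 'b' is a column label, not a row label, so .loc fails.
loc_fails = raises_key_error(lambda: df.loc["b"])

# Question 4: 'x' is a row label, not a column label, so brackets fail.
brackets_fail = raises_key_error(lambda: df[["x"]])

print(type(col).__name__, type(row_df).__name__, loc_fails, brackets_fail)
```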