首页 > 编程知识 正文

如何根据条件选择 Pandas 数据帧中的行

时间:2023-11-26 13:13:13 阅读:309399 作者:PWYF

在本教程中,我们将学习用户如何使用 Python 基于条件选择 Pandas 数据帧中的行。

用户可以使用“>”、“=”、“< =”、“> =”、“!= '运算符。

条件:

我们将讨论适用于 Pandas 数据帧的不同条件。

条件 1:

使用基本方法从数据帧中选择“百分比”大于 70 的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subject_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

# Then we will select rows based on condition
result_DataFrame = Data_Frame[Data_Frame['Percentage_1'] > 70]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
5    John     24          ADS            78
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

条件 2:

使用“ loc[] ”方法从数据帧中选择“百分比”大于 70 的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

# Then we will select rows based on condition, That is, Using loc[] method
result_DataFrame = Data_Frame.loc[Data_Frame['Percentage_1'] > 70]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
5    John     24          ADS            78
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

条件 3:

使用“ loc[] ”方法从数据帧中选择“百分比”不等于 71 的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

# Then we will select rows based on condition, That is, Using loc[] method
result_DataFrame = Data_Frame.loc[Data_Frame['Percentage_1'] != 71]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
    Name_1  Age_1 Subjects_1  Percentage_1
0    Anuj     23       DBMS            88
1    Ashu     24        ADS            62
2   Yashi     21       ASPM            85
4  Joshua     21       MFCS            55
5    John     24        ADS            78
6     Ray     25       ASPM            70
7   Lilly     22        TOC            66
9  Rachel     22       OOPS            89

现在,我们将学习如何使用 DataFrame 的“isin()”函数来选择那些列值出现在列表中的行。

条件 4:

使用基本方法,选择给定数据帧中“科目 _1 ”列值出现在“科目 _2 列表中的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

Subjects_2 = ['ASPM', 'ADS', 'TOC']

# Then we will select rows based on condition, That is, Using isin[] method
result_DataFrame = Data_Frame[Data_Frame['Subjects_1'].isin(Subjects_2)]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
   Name_1  Age_1 Subjects_1  Percentage_1
1   Ashu     24        ADS            62
2  Yashi     21       ASPM            85
5   John     24        ADS            78
6    Ray     25       ASPM            70
7  Lilly     22        TOC            66

条件 5:

使用“ loc[] ”方法选择给定数据帧中“主题 _1 ”列值出现在“主题 _2 ”列表中的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

Subjects_2 = ['ASPM', 'ADS', 'TOC']

# Then we will select rows based on condition, That is, Using isin[] method
result_DataFrame = Data_Frame.loc[Data_Frame['Subjects_1'].isin(Subjects_2)]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
   Name_1  Age_1 Subjects_1  Percentage_1
1   Ashu     24        ADS            62
2  Yashi     21       ASPM            85
5   John     24        ADS            78
6    Ray     25       ASPM            70
7  Lilly     22        TOC            66

条件 6:

使用“ loc[] ”方法选择给定数据帧中“主题 _1 ”列值不在“主题 _2 ”列表中的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 24, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 62, 85, 71, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

Subjects_2 = ['ASPM', 'ADS', 'TOC']

# Then we will select rows based on condition, That is, Using isin[] method
result_DataFrame = Data_Frame.loc[~Data_Frame['Subjects_1'].isin(Subjects_2)]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     24          ADS            62
2   Yashi     21         ASPM            85
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
3    Mark     19          BCM            71
4  Joshua     21         MFCS            55
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

现在,我们将学习如何使用“&”运算符基于多列条件选择行。

条件 7:

使用基本方法,从给定数据帧中选择“百分比 _1 ”等于“ 71 ”且“主题 _1 ”出现在“主题 _2 列表中的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 21, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 71, 71, 82, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

Subjects_2 = ['ASPM', 'ADS', 'TOC']

# Then we will select rows based on condition, That is, Using isin[] method
result_DataFrame = Data_Frame[(Data_Frame['Percentage_1'] == 71) &
                              Data_Frame['Subjects_1'].isin(Subjects_2)]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     21          ADS            71
2   Yashi     21         ASPM            71
3    Mark     19          BCM            82
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
   Name_1  Age_1 Subjects_1  Percentage_1
1   Ashu     21        ADS            71
2  Yashi     21       ASPM            71

条件 8:

使用“loc【】”方法选择给定数据帧中“百分比 _1”等于“ 71 ”且“主题 _1 ”出现在“主题 _2 ”列表中的所有行。

代码:


# First, import pandas
import pandas as pnd
record_1 = {

 'Name_1': ['Anuj', 'Ashu', 'Yashi', 'Mark', 'Joshua', 'John', 'Ray', 'Lilly', 'Rose', 'Rachel' ],
 'Age_1': [23, 21, 21, 19, 21, 24, 25, 22, 23, 22],
 'Subjects_1': ['DBMS', 'ADS', 'ASPM', 'BCM', 'MFCS', 'ADS', 'ASPM', 'TOC', 'Data Mining', 'OOPS'],
 'Percentage_1': [88, 71, 71, 82, 55, 78, 70, 66, 71, 89] }

# Now, we are creating a dataframe
Data_Frame = pnd.DataFrame(record_1, columns = ['Name_1', 'Age_1', 'Subjects_1', 'Percentage_1'])

print("Given DataFrame: n", Data_Frame) 

Subjects_2 = ['ASPM', 'ADS', 'TOC']

# Then we will select rows based on condition, That is, Using isin[] method
result_DataFrame = Data_Frame.loc[(Data_Frame['Percentage_1'] == 71) &
                              Data_Frame['Subjects_1'].isin(Subjects_2)]

print('nFollowing is the Result DataFrame: n', result_DataFrame)

输出:

Given DataFrame: 
    Name_1  Age_1   Subjects_1  Percentage_1
0    Anuj     23         DBMS            88
1    Ashu     21          ADS            71
2   Yashi     21         ASPM            71
3    Mark     19          BCM            82
4  Joshua     21         MFCS            55
5    John     24          ADS            78
6     Ray     25         ASPM            70
7   Lilly     22          TOC            66
8    Rose     23  Data Mining            71
9  Rachel     22         OOPS            89

Following is the Result DataFrame: 
   Name_1  Age_1 Subjects_1  Percentage_1
1   Ashu     21        ADS            71
2  Yashi     21       ASPM            71

结论

在本教程中,我们讨论了如何根据各种条件选择 Pandas 数据帧的不同行。


版权声明:该文观点仅代表作者本人。处理文章:请发送邮件至 三1五14八八95#扣扣.com 举报,一经查实,本站将立刻删除。