Work with rows of a dataframe or series object
row_element = df[<column>][<row>]
entire_row = df.iloc[0]
Rows are represented as Series objects whose index labels are the column names of the dataframe.
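A minimal sketch of what a row looks like once it is pulled out of a dataframe. The sample data and column names here are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical sample data, not taken from the notes above.
df = pd.DataFrame(
    {"Name": ["Ann", "Bob"], "City": ["Boston", "Denver"], "Scores": [79, 85]},
    index=[101, 102],
)

row = df.iloc[0]            # first row, by position
print(type(row).__name__)   # Series
print(list(row.index))      # ['Name', 'City', 'Scores']
print(row.name)             # 101, the row's label in the dataframe
print(row["City"])          # Boston
```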
Applying the len() function to a dataframe returns the number of rows in it.
len(df) # number of rows in the dataframe
len(df['Scores']) # number of rows in a column
df['Scores'].count() # number of non-missing values in a column
df.index # an Index object holding all the row labels
df.index[1] # Get the label of your second row
max(df.index) # largest row label
print(df.index) # RangeIndex(start=0, stop=100, step=1)
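The difference between len() and count() only shows up when values are missing. A small sketch, assuming a hypothetical Scores column with one NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score.
df = pd.DataFrame({"Scores": [79, np.nan, 85]}, index=[100, 101, 102])

print(len(df))                # 3: every row, missing values included
print(len(df["Scores"]))      # 3
print(df["Scores"].count())   # 2: only non-missing values
print(df.index[1])            # 101
print(max(df.index))          # 102
```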
You can also use the shape attribute to get the dimensionality of the dataframe. It returns a tuple of (rows, columns).
print(df.shape) # (rows, columns)
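Because shape is an ordinary tuple, it can be unpacked directly. A short sketch on a hypothetical three-row dataframe:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Ann", "Bob", "Cy"], "Scores": [79, 85, 91]})

n_rows, n_cols = df.shape   # unpack the (rows, columns) tuple
print(n_rows, n_cols)       # 3 2
print(df["Scores"].shape)   # (3,): a Series has a single dimension
```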
You can also access a whole row by its label using the loc[] accessor, or by its integer position using iloc[]. Referencing a single row this way returns a Series object.
df.loc[102] # Get all the elements for each column in row labeled 102
df.iloc[0] # Get all the elements for each column in the first row
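The two accessors only differ when row labels are not simply 0, 1, 2, and so on. A sketch with hypothetical labels starting at 101:

```python
import pandas as pd

# Hypothetical sample data with non-default row labels.
df = pd.DataFrame(
    {"Name": ["Ann", "Bob", "Cy"], "Scores": [79, 85, 91]},
    index=[101, 102, 103],
)

by_label = df.loc[102]       # row whose label is 102
by_position = df.iloc[0]     # first row, whatever its label
print(by_label["Name"])      # Bob
print(by_position["Name"])   # Ann

# Passing a list of labels returns a DataFrame rather than a Series.
subset = df.loc[[102]]
print(type(subset).__name__)  # DataFrame
```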
One way to insert a new row into your dataframe is to first create it as a standalone series using the Series class with the following arguments:
data: The values for each column of the row, as a list (in column order)
index: The index labels, which are the column names of the dataframe
name: The label for the new row
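After creating the Series representing the new row, you can append it to the end of the dataframe with the append() method. append() does not mutate the dataframe; it returns a new one. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0, where pd.concat() takes its place.
john = pd.Series(data=['John', 'Boston', 34, 79], index=df.columns, name=17)
df = df.append(john) # pandas < 2.0
# df = pd.concat([df, john.to_frame().T]) # pandas >= 2.0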
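A sketch of the pd.concat() form of this insert, for pandas versions where append() is unavailable (2.0 and later). The one-row starting dataframe and its column names are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical starting dataframe with a single row labeled 16.
df = pd.DataFrame(
    {"Name": ["Ann"], "City": ["Denver"], "Age": [41], "Scores": [85]},
    index=[16],
)
john = pd.Series(data=["John", "Boston", 34, 79], index=df.columns, name=17)

# to_frame() yields a single column named 17; .T turns it into a single row.
df = pd.concat([df, john.to_frame().T])
print(df.index.tolist())     # [16, 17]
print(df.loc[17, "City"])    # Boston
```

The transposed row holds mixed types, so numeric columns may come back with object dtype; calling df.infer_objects() afterwards is one possible way to restore them.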
If creating a new Series object is overkill, you can pass a dictionary as the row data instead. append() only accepts a dictionary when ignore_index=True is set, and the rows of the result are renumbered from 0.
df = df.append(
{
"first_name": first_name,
"last_name": last_name,
"title": book_title,
"publisher": publisher_name,
},
ignore_index=True,
)
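On pandas 2.0 and later, a dictionary row can be added by wrapping it in a one-row dataframe and concatenating. A sketch with hypothetical book data:

```python
import pandas as pd

# Hypothetical existing data.
df = pd.DataFrame(
    {
        "first_name": ["Jane"],
        "last_name": ["Doe"],
        "title": ["A Book"],
        "publisher": ["Acme"],
    }
)
new_row = {
    "first_name": "John",
    "last_name": "Smith",
    "title": "Another Book",
    "publisher": "Acme",
}

# Wrap the dict in a list to build a one-row frame, then renumber the index.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(df.index.tolist())          # [0, 1]
print(df.loc[1, "first_name"])    # John
```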
With the drop() method you can delete an arbitrary row by its label. This does not mutate your dataframe; it returns a new dataframe object.
df = df.drop(labels=[17])
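A quick sketch confirming that drop() leaves the original untouched, using a hypothetical three-row dataframe:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Scores": [79, 85, 91]}, index=[16, 17, 18])

trimmed = df.drop(labels=[17])   # new dataframe without row 17
print(df.index.tolist())         # [16, 17, 18]: original unchanged
print(trimmed.index.tolist())    # [16, 18]
```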
You can also pass a conditional expression to loc[] to filter the rows. An optional second argument selects which columns to keep.
nba.loc[nba["fran_id"] == "Lakers", "team_id"].value_counts()
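The condition evaluates to a boolean Series with one value per row, and loc[] keeps the rows where it is True. A sketch with a small, made-up stand-in for the nba dataframe:

```python
import pandas as pd

# Hypothetical stand-in for the nba dataframe used above.
nba = pd.DataFrame(
    {
        "fran_id": ["Lakers", "Lakers", "Celtics", "Lakers"],
        "team_id": ["LAL", "MNL", "BOS", "LAL"],
    }
)

is_lakers = nba["fran_id"] == "Lakers"   # boolean Series, one value per row
counts = nba.loc[is_lakers, "team_id"].value_counts()
print(counts["LAL"])   # 2
print(counts["MNL"])   # 1
```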
To apply an aggregation function across each row of a dataframe, pass axis="columns" to the aggregation function.
df["Totals"] = df.sum(axis="columns", numeric_only=True) # skip non-numeric columns such as names
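The axis name refers to the direction the function moves along, so axis="columns" collapses the columns and leaves one value per row. A sketch on a hypothetical all-numeric dataframe:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Math": [70, 80], "Art": [90, 60]}, index=["Ann", "Bob"])

row_totals = df.sum(axis="columns")   # one value per row
col_totals = df.sum(axis="index")     # one value per column
print(row_totals.tolist())   # [160, 140]
print(col_totals.tolist())   # [150, 150]

df["Totals"] = row_totals
```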