Python Head | Pandas DataFrame: head() Method

Introduction

Pandas is a powerful Python library for manipulating and analyzing data. It offers many methods for viewing and manipulating data, one of which is head(). The head() method lets you view the first n rows of a data structure, and it can be called on both pandas DataFrame and Series objects.

In this tutorial you will practice using the pandas head() method on data, see how pandas can be used in machine learning, and learn how to use the pandas tail() method.

What is head() in Python?

Pandas has two main data structures: DataFrames for two-dimensional data and Series for one-dimensional data. The head function is a pandas method that can be used on a pandas DataFrame as well as a pandas Series. By default, it returns the first 5 rows of the data structure it is called on. You can also pass in a parameter to specify a different number of rows.

How do you use head() in Python?

To call the function on a DataFrame df, use df.head(). To call it on a Series s, use s.head().

The head function has one optional parameter, n, that specifies the number of rows returned. Its default value is 5. We can specify another number by passing it in as an argument, for example: df.head(10). If we pass in a negative value -n, it will return all of the rows except the last n. If we pass in an argument that is not an integer, we will get a type error:

>>> df.head(2.5)
TypeError: cannot do positional indexing on RangeIndex with these indexers [2.5]

You can learn more about type errors here.
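As a quick check of the behavior described above, the sketch below builds a small hypothetical ten-row DataFrame (the column name x is only for illustration) and compares the default call, a positive n, and a negative n.

```python
import pandas as pd

# Hypothetical ten-row frame; the column name "x" is illustrative only.
df = pd.DataFrame({"x": range(10)})

default_rows = df.head()        # no argument: first 5 rows
first_three = df.head(3)        # positive n: first 3 rows
all_but_last_two = df.head(-2)  # negative n: everything except the last 2 rows

print(len(default_rows), len(first_three), len(all_but_last_two))  # 5 3 8
```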

What is the use of the head function with a Series?

The head method can also be called on a pandas Series. Let's look at an example of how we can use the head function on a Series with data. Let's first create a complete Series with sample data:

>>> import pandas as pd
>>> s = pd.Series([543, 100, 235, 572, 293])
>>> print(s)
0    543
1    100
2    235
3    572
4    293
dtype: int64

If we call the head function on the Series without any arguments, it will return the first 5 rows. We can also pass in an argument to specify the number of rows:

>>> head = s.head(3)
>>> print(head)
0    543
1    100
2    235
dtype: int64
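Note that head() selects rows by position, not by index label. A minimal sketch, assuming a hypothetical Series whose string labels are deliberately out of order:

```python
import pandas as pd

# Hypothetical Series whose labels are deliberately out of alphabetical order.
s = pd.Series([543, 100, 235, 572], index=["d", "a", "c", "b"])

# head(2) takes the first two rows as stored; labels are not sorted or looked up.
top_two = s.head(2)

print(top_two.index.tolist())  # ['d', 'a']
print(top_two.tolist())        # [543, 100]
```

The returned Series keeps the original labels, so later label-based lookups on the preview still behave as they would on the full data.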

How do I get the head of a DataFrame in Python?

The head method can be called on a pandas DataFrame. Let's look at an example of how we can use the head function on a DataFrame. If you have not already done so, import pandas:

>>> import pandas as pd

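Next, let's create a sample pandas DataFrame. Older pandas versions offered the helper pd.util.testing.makeMixedDataFrame() for this, but it is no longer available in current releases, so we build the same data explicitly:

>>> df = pd.DataFrame({
...     "A": [0.0, 1.0, 2.0, 3.0, 4.0],
...     "B": [0.0, 1.0, 0.0, 1.0, 0.0],
...     "C": ["foo1", "foo2", "foo3", "foo4", "foo5"],
...     "D": pd.bdate_range("2009-01-01", periods=5),
... })
>>> print(df)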
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07

Now we can call the head function on the DataFrame to display the first 5 rows:

>>> head = df.head()
>>> print(head)
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07

We can also pass in an argument to specify the number of rows returned:

>>> head = df.head(2)
>>> print(head)
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
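Two details are worth checking for yourself: asking for more rows than exist simply returns everything, and head() returns a new object rather than truncating the original. A small sketch, using a hypothetical three-row frame:

```python
import pandas as pd

# Hypothetical three-row frame for illustration.
df = pd.DataFrame({"A": [0.0, 1.0, 2.0], "C": ["foo1", "foo2", "foo3"]})

preview = df.head(10)   # n exceeds the row count, so all 3 rows come back
first_row = df.head(1)  # a new one-row DataFrame

print(len(preview))             # 3
print(len(df))                  # 3 (calling head() did not shrink df)
print(first_row["C"].tolist())  # ['foo1']
```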

What are the head() and tail() functions?

The tail() function is a pandas method that is very similar to head(). Instead of the first rows, it returns the last n rows of the data structure (5 by default). As with head(), you can pass in an argument to specify the number of rows, for example df.tail(6). Let's look at an example of how to use the tail function on a DataFrame:

>>> tail = df.tail(4)
>>> print(tail)
     A    B     C          D
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07
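Like head(), tail() accepts a negative n, in which case it returns every row except the first n. The two methods can also be chained to pick out a middle slice. A brief sketch on a hypothetical five-row frame:

```python
import pandas as pd

# Hypothetical five-row frame; values equal their row positions.
df = pd.DataFrame({"A": [0, 1, 2, 3, 4]})

last_two = df.tail(2)         # last 2 rows
skip_first_two = df.tail(-2)  # negative n: all rows except the first 2
middle = df.head(4).tail(3)   # first 4 rows, then the last 3 of those

print(last_two["A"].tolist())        # [3, 4]
print(skip_first_two["A"].tolist())  # [2, 3, 4]
print(middle["A"].tolist())          # [1, 2, 3]
```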

Pandas ML Exercises

Pandas data structures pair easily with machine learning libraries such as TensorFlow and scikit-learn. Pandas comes in handy when cleaning, organizing, and structuring your dataset for a machine learning model. Let's go over an example of how you can use pandas and scikit-learn to create a linear regression model.

Importing the Data

Pandas offers many ways to import data directly into a DataFrame. The pandas.read_csv function, for example, reads a CSV file straight into a DataFrame:

>>> df = pd.read_csv("data.csv")
>>> print(df)
   Model    A     B    C   D
0      1  9.3  1231  yes  21
1      2  8.9  2139   no  35
2      3  2.3  9098   no  35
3      4  6.2  2372  yes  23
4      5  7.1  7432   no  22
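If you want to try read_csv without a file on disk, it also accepts file-like objects. The sketch below is only an illustration: the CSV text is an in-memory stand-in for data.csv, reusing a few columns from the table above, and head() is used to preview the result.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file; the contents are illustrative.
csv_text = """Model,A,B
1,9.3,1231
2,8.9,2139
3,2.3,9098
4,6.2,2372
5,7.1,7432
"""

# read_csv accepts a path or any file-like object, so StringIO works here.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # (5, 3)
print(df.head(2))  # preview the first 2 rows before further processing
```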

Formatting the Data

The pd.get_dummies function allows you to one-hot encode categorical data. Note that this only needs to be done for non-numerical columns. Passing dtype=int produces the 0/1 output shown here; recent pandas versions return True/False by default.

>>> encode = pd.get_dummies(df["C"], dtype=int)
>>> print(encode)
   no  yes
0   0    1
1   1    0
2   1    0
3   0    1
4   1    0

You can then add the new encoding to the dataframe:

>>> df = df.drop('C', axis=1)
>>> data = df.join(encode)
>>> print(data)
   Model    A     B   D  no  yes
0      1  9.3  1231  21   0    1
1      2  8.9  2139  35   1    0
2      3  2.3  9098  35   1    0
3      4  6.2  2372  23   0    1
4      5  7.1  7432  22   1    0
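The encode, drop, and join steps can be tried end to end on a tiny frame. This is a minimal sketch under the assumption that column C holds yes/no strings; the three rows are illustrative.

```python
import pandas as pd

# Small illustrative frame; column C is assumed to hold yes/no strings.
df = pd.DataFrame({
    "A": [9.3, 8.9, 2.3],
    "C": ["yes", "no", "no"],
})

# dtype=int requests 0/1 columns; newer pandas versions default to booleans.
encoded = pd.get_dummies(df["C"], dtype=int)

# Replace the original categorical column with its one-hot columns.
data = df.drop(columns="C").join(encoded)

print(data.columns.tolist())  # ['A', 'no', 'yes']
print(data["yes"].tolist())   # [1, 0, 0]
```

The join works because get_dummies preserves the original row index, so each encoded row lines up with the row it came from.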

Creating a model with the data

We can now use scikit-learn to create a linear regression model. You also need a DataFrame holding your target values:

>>> y = pd.read_csv("results.csv")
>>> print(y)
   Target
0      98
1      34
2      23
3      55
4      62

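Now we can import the LinearRegression class and the train_test_split function from scikit-learn, split the data into training and test sets, and then train the model using the fit() method. Setting only test_size lets the remaining rows be used for training.

>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(data, y, test_size=0.3)
>>> model = LinearRegression()
>>> model.fit(X_train, y_train)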

Once the model is trained, we can call its predict() method on new data. Here newX is a one-row DataFrame with the same columns as the training data:

>>> print(newX)
   Model    A     B   D  no  yes
0      1  6.7  1321  26   0    1
>>> P = model.predict(newX)
>>> print(P)
[55]
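Putting the steps together, here is a self-contained sketch of the workflow. It rebuilds the feature and target values from the tables in this section in code rather than reading CSV files, and it sets random_state and test_size=0.4, which are assumptions added only to make the split repeatable on five rows. With so little data the fitted model is not meaningful; the point is the mechanics.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Feature and target values copied from the tables in this section.
X = pd.DataFrame({
    "Model": [1, 2, 3, 4, 5],
    "A": [9.3, 8.9, 2.3, 6.2, 7.1],
    "B": [1231, 2139, 9098, 2372, 7432],
    "D": [21, 35, 35, 23, 22],
    "no": [0, 1, 1, 0, 1],
    "yes": [1, 0, 0, 1, 0],
})
y = pd.Series([98, 34, 23, 55, 62], name="Target")

# random_state and test_size=0.4 are assumptions chosen for a repeatable split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)

# New rows must have the same columns, in the same order, as the training data.
new_X = pd.DataFrame([[1, 6.7, 1321, 26, 0, 1]], columns=X.columns)
prediction = model.predict(new_X)

print(X_train.head())    # inspect the training rows that were selected
print(prediction.shape)  # (1,)
```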

Summary

In this tutorial you learned what the Pandas head function is used for and how to use it. First, you learned the syntax. Then, you learned how to use the head function on both pandas dataframes and series. Then, you learned how to use the pandas tail function. Finally, you learned how pandas can be used for machine learning in Python.

Next steps

If you're interested in learning more about the basics of Python, coding, and software development, check out our Coding Essentials Guidebook for Developers, where we cover the essential languages, concepts, and tools that you'll need to become a professional developer.

Thanks and happy coding! We hope you enjoyed this article. If you have any questions or comments, feel free to reach out to jacob@initialcommit.io.

