Python Head | Pandas DataFrame: head() Method
Table of Contents
- Introduction
- What is head() in Python?
- How do you use head() in Python?
- What is the use of the head function with a Series?
- How do I get the head of a DataFrame in Python?
- What are the head() and tail() functions?
- Pandas ML Exercises
- Formatting the Data
- Summary
- Next steps
- References
Introduction
Pandas is a powerful Python library for manipulating and analyzing data. It offers many methods for viewing and manipulating data, one of which is the head() method. The head() method lets you view the first n rows of a data structure, and it is available on both pandas DataFrames and Series objects.
In this tutorial you will practice using the pandas head() method on data, see how pandas can be used in machine learning, and learn how to use the tail() method.
What is head() in Python?
Pandas has two main data structures: DataFrames for two-dimensional data and Series for one-dimensional data. The head function is a pandas method that can be used on a DataFrame as well as on a Series. By default it returns the first 5 rows of the data structure it is called on. You can also pass in a parameter to specify a different number of rows.
How do you use head() in Python?
To call the function on a DataFrame df, we can use this syntax: df.head()
To call the function on a Series s, we can use this syntax: s.head()
The head function has one optional parameter, n, that specifies the number of rows returned. Its default value is 5. We can specify another number by passing it in as an argument, for example: df.head(10). If we pass in a negative value -n, it returns all of the rows except the last n. If we pass in an argument that is not an integer, we get a type error:
>>> df.head(2.5)
TypeError: cannot do positional indexing on RangeIndex with these indexers [2.5]
You can learn more about type errors here.
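The rules above can be checked with a short script. This is a minimal sketch: the ten-row DataFrame and its column name x are made up for illustration and do not come from the article.

```python
import pandas as pd

# Hypothetical ten-row frame used only for this illustration.
df = pd.DataFrame({"x": range(10)})

print(len(df.head()))     # default n=5
print(len(df.head(3)))    # first 3 rows
print(len(df.head(-2)))   # all rows except the last 2
print(len(df.head(50)))   # n larger than the frame: every row, no error

try:
    df.head(2.5)          # non-integer n
except TypeError as exc:
    print("TypeError:", exc)
```

Asking for more rows than exist is not an error; head simply returns everything that is there.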
What is the use of the head function with a Series?
The head method can also be called on a pandas Series. Let's look at an example of how we can use the head function on a Series with data. Let's first create a complete Series with sample data:
>>> import pandas as pd
>>> s = pd.Series([543, 100, 235, 572, 293])
>>> print(s)
0    543
1    100
2    235
3    572
4    293
dtype: int64
If we call the head function on the Series without any arguments, it will return the first 5 rows. We can also pass in an argument to specify the number of rows:
>>> head = s.head(3)
>>> print(head)
0    543
1    100
2    235
dtype: int64
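Note that head selects by position, not by index label. The sketch below uses a made-up Series with string labels (an assumption for illustration only) to show that the first rows come back in stored order, whatever their labels are.

```python
import pandas as pd

# Hypothetical Series with string labels instead of the default 0..n-1 index.
s = pd.Series([543, 100, 235], index=["c", "a", "b"])

first_two = s.head(2)
print(first_two)              # rows labelled "c" and "a", in stored order
print(list(first_two.index))  # ['c', 'a']
```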
How do I get the head of a DataFrame in Python?
The head method can be called on a pandas DataFrame. Let's look at an example of how we can use the head function on a DataFrame. We first want to import pandas:
>>> import pandas as pd
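Next, let's create a sample pandas DataFrame. Older pandas versions provided pd.util.testing.makeMixedDataFrame() for this, but that helper is no longer available in recent releases, so we build the same data explicitly:
>>> df = pd.DataFrame({"A": [0.0, 1.0, 2.0, 3.0, 4.0], "B": [0.0, 1.0, 0.0, 1.0, 0.0],
...                    "C": ["foo1", "foo2", "foo3", "foo4", "foo5"],
...                    "D": pd.bdate_range("2009-01-01", periods=5)})
>>> print(df)
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07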
Now we can call the head function on the DataFrame to display the first 5 rows:
>>> head = df.head()
>>> print(head)
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07
We can also pass in an argument to specify the number of rows returned:
>>> head = df.head(2)
>>> print(head)
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
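One way to think about df.head(n) is as a positional slice of the first n rows. The sketch below, on a small made-up frame, checks that the result matches df.iloc[:n] and that the original DataFrame is left as it was.

```python
import pandas as pd

# Hypothetical three-row frame used only for this illustration.
df = pd.DataFrame({"A": [0.0, 1.0, 2.0], "C": ["foo1", "foo2", "foo3"]})

top = df.head(2)
assert top.equals(df.iloc[:2])   # same rows as a positional slice
assert len(df) == 3              # the original frame still has every row
print(top)
```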
What are the head() and tail() functions?
The tail() function is a pandas method that is very similar to the head() function, except that it returns the last n rows of the data structure. As with the head function, you can pass in an argument to specify the number of rows: df.tail(6). Let's look at an example of how to use the tail function on a DataFrame with data:
>>> tail = df.tail(4)
>>> print(tail)
     A    B     C          D
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07
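Like head, tail accepts a negative argument; in that case it returns every row except the first n. A small sketch on a made-up five-row frame shows the two methods side by side.

```python
import pandas as pd

# Hypothetical five-row frame used only for this illustration.
df = pd.DataFrame({"A": [0.0, 1.0, 2.0, 3.0, 4.0]})

print(df.tail(2).index.tolist())    # [3, 4]
print(df.tail(-1).index.tolist())   # [1, 2, 3, 4]  (all but the first row)
print(df.head(-1).index.tolist())   # [0, 1, 2, 3]  (all but the last row)
```

The returned rows keep their original index labels, which is why the tail output above starts at 1 rather than 0.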
Pandas ML Exercises
Pandas and its data structures pair easily with machine learning libraries such as TensorFlow and scikit-learn. Pandas comes in handy when cleaning, organizing, and structuring your dataset for a machine learning model. Let's go over an example of how you can use pandas and scikit-learn to create a linear regression model.
Importing the Data
Pandas offers many ways to import data directly into a DataFrame, which can come in handy. The pandas.read_csv function allows you to import a CSV file directly into a DataFrame:
>>> df = pd.read_csv("data.csv")
>>> print(df)
   Model    A     B    C   D
0      1  9.3  1231  yes  21
1      2  8.9  2139   no  35
2      3  2.3  9098   no  35
3      4  6.2  2372  yes  23
4      5  7.1  7432   no  22
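To try read_csv without a file on disk, you can pass it any file-like object. The sketch below is an illustration only: the CSV text is a made-up stand-in for data.csv, wrapped in io.StringIO, and head is used for a quick first look at what was loaded.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for "data.csv".
csv_text = """Model,A,B,C,D
1,9.3,1231,yes,21
2,8.9,2139,no,35
3,2.3,9098,no,35
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.head(2))   # quick look at the first rows
print(df.dtypes)    # check which columns were parsed as numbers vs. text
```

Checking dtypes right after loading is a cheap way to spot the text columns that will need encoding before modelling.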
Formatting the Data
The pd.get_dummies function allows you to use one-hot encoding for any categorical data. Note that this only needs to be done for non-numerical data. Passing dtype=int keeps the 0/1 output shown here; recent pandas versions return True/False by default.
>>> encode = pd.get_dummies(df["C"], dtype=int)
>>> print(encode)
   no  yes
0   0    1
1   1    0
2   1    0
3   0    1
4   1    0
You can then drop the original column and add the new encoding to the DataFrame:
>>> df = df.drop('C', axis=1)
>>> data = df.join(encode)
>>> print(data)
   Model    A     B   D  no  yes
0      1  9.3  1231  21   0    1
1      2  8.9  2139  35   1    0
2      3  2.3  9098  35   1    0
3      4  6.2  2372  23   0    1
4      5  7.1  7432  22   1    0
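The encode, drop, and join steps can be put together in one runnable sketch. The three-row frame here is made up for illustration, and the second half shows an alternative that is not used in this article: letting pd.get_dummies replace the column in a single call through its columns parameter.

```python
import pandas as pd

# Hypothetical frame with one numeric and one categorical column.
df = pd.DataFrame({"A": [9.3, 8.9, 2.3], "C": ["yes", "no", "no"]})

# dtype=int keeps 0/1 output; recent pandas versions default to True/False.
encode = pd.get_dummies(df["C"], dtype=int)
data = df.drop("C", axis=1).join(encode)
print(data)

# Alternative: encode and replace the column in one call.
# The new column names then carry a prefix: "C_no", "C_yes".
data2 = pd.get_dummies(df, columns=["C"], dtype=int)
print(list(data2.columns))
```

The prefixed names in the second form help avoid clashes when more than one categorical column shares the same values.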
Creating a model with the data
We can now use scikit-learn to create a linear regression model. You also need a DataFrame holding your target values:
>>> y = pd.read_csv("results.csv")
>>> print(y)
   Target
0      98
1      34
2      23
3      55
4      62
Now we can import the LinearRegression class and the train_test_split function from scikit-learn, split the data into training and test sets, and then train the model using the fit() method.
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(data, y, test_size=0.3)
>>> model = LinearRegression()
>>> model.fit(X_train, y_train)
Once the model is trained, we can call its predict() method on new data. Here newX is a one-row DataFrame with the same columns as the training data:
>>> print(newX)
   Model    A     B   D  no  yes
0      1  6.7  1321  26   0    1
>>> P = model.predict(newX)
>>> print(P)
[55]
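The whole workflow can be sketched as a single script. The feature and target values are copied from the tables in this section so that no CSV files are needed; holding the target as a Series and passing random_state=0 are assumptions added here to keep the split repeatable and the prediction one-dimensional. With only five rows the fitted model is not meaningful, so no predicted value is shown.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Feature and target values copied from the tables in this section.
data = pd.DataFrame({
    "Model": [1, 2, 3, 4, 5],
    "A": [9.3, 8.9, 2.3, 6.2, 7.1],
    "B": [1231, 2139, 9098, 2372, 7432],
    "D": [21, 35, 35, 23, 22],
    "no": [0, 1, 1, 0, 1],
    "yes": [1, 0, 0, 1, 0],
})
y = pd.Series([98, 34, 23, 55, 62], name="Target")

# random_state is an added assumption so the split is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    data, y, test_size=0.3, random_state=0
)
model = LinearRegression()
model.fit(X_train, y_train)

# New rows need the same columns, in the same order, as the training data.
newX = pd.DataFrame([[1, 6.7, 1321, 26, 0, 1]], columns=data.columns)
P = model.predict(newX)
print(P.shape)  # one prediction for the one input row
```

Building newX from data.columns keeps the feature names and their order aligned with what the model saw during fit.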
Summary
In this tutorial you learned what the Pandas head function is used for and how to use it. First, you learned the syntax. Then, you learned how to use the head function on both pandas dataframes and series. Then, you learned how to use the pandas tail function. Finally, you learned how pandas can be used for machine learning in Python.
Next steps
If you're interested in learning more about the basics of Python, coding, and software development, check out our Coding Essentials Guidebook for Developers, where we cover the essential languages, concepts, and tools that you'll need to become a professional developer.
Thanks and happy coding! We hope you enjoyed this article. If you have any questions or comments, feel free to reach out to jacob@initialcommit.io.
References
- Pandas Documentation Head - https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html
- Pandas Documentation Tail - https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.tail.html
- Pandas head - https://appdividend.com/2022/06/15/pandas-head/