


How To Create A Subset Of A Dataframe In Python


Python Pandas – Create a subset and display only the last entry from duplicate values


To create a subset and display only the last entry from duplicate values, pass keep='last' to the drop_duplicates() method. The drop_duplicates() method returns a DataFrame with the duplicate rows removed.
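As a quick illustration of what the keep parameter controls, the sketch below compares its three accepted values on a small toy DataFrame. The column names key and value and the data itself are hypothetical and are not part of the example that follows.

```python
import pandas as pd

# Hypothetical toy data (not from the article): key "a" appears twice
df = pd.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})

# keep="first": keep the first occurrence of each duplicated key
first = df.drop_duplicates(subset=["key"], keep="first")

# keep="last": keep the last occurrence of each duplicated key
last = df.drop_duplicates(subset=["key"], keep="last")

# keep=False: drop every row whose key is duplicated
none = df.drop_duplicates(subset=["key"], keep=False)

print(first)
print(last)
print(none)
```

The original index labels are preserved in each result, which is why the example in this article calls reset_index(drop=True) afterwards.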

Let us first create a DataFrame with 3 columns −

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [85, 70, 80, 95, 55, 90]
})

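Next, remove the duplicates and keep only the last entry. Setting the keep parameter to 'last' deletes every duplicate row except its last occurrence. The subset parameter restricts the comparison to the Car and Place columns, so two rows count as duplicates when both of those values match, even if UnitsSold differs −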

dataFrame2 = dataFrame.drop_duplicates(subset=['Car', 'Place'], keep='last').reset_index(drop=True)
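To see which rows this call treats as duplicates, one option is to inspect the boolean mask returned by duplicated(), which accepts the same subset and keep arguments. The following is a minimal sketch on the same data; the names mask, kept and expected are introduced here for illustration only.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [85, 70, 80, 95, 55, 90]
})

# True marks the rows that drop_duplicates(keep='last') would remove
mask = dataFrame.duplicated(subset=['Car', 'Place'], keep='last')
print(mask.tolist())

# Filtering with the inverted mask selects the same rows as drop_duplicates
kept = dataFrame[~mask].reset_index(drop=True)
expected = dataFrame.drop_duplicates(subset=['Car', 'Place'], keep='last').reset_index(drop=True)
print(kept.equals(expected))
```

Rows 0 and 1 are flagged because the same Car and Place pair appears again later, at rows 3 and 4.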

Example

Following is the code −

import pandas as pd

# Create DataFrame
dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [85, 70, 80, 95, 55, 90]
})

print("Dataframe...")
print(dataFrame)

# Remove duplicates and keep only the last entry:
# keep='last' deletes every duplicate row except its last occurrence;
# subset limits the duplicate check to the 'Car' and 'Place' columns
dataFrame2 = dataFrame.drop_duplicates(subset=['Car', 'Place'], keep='last').reset_index(drop=True)

print("\nUpdated DataFrame after removing duplicates...")
print(dataFrame2)

Output

This will produce the following output −

Dataframe...
           Car       Place  UnitsSold
0          BMW       Delhi         85
1     Mercedes   Hyderabad         70
2  Lamborghini  Chandigarh         80
3          BMW       Delhi         95
4     Mercedes   Hyderabad         55
5      Porsche      Mumbai         90

Updated DataFrame after removing duplicates...
           Car       Place  UnitsSold
0  Lamborghini  Chandigarh         80
1          BMW       Delhi         95
2     Mercedes   Hyderabad         55
3      Porsche      Mumbai         90
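The index in the updated DataFrame runs from 0 to 3 only because of the reset_index(drop=True) call; without it the surviving rows would keep their original labels 2, 3, 4 and 5. As an alternative sketch, assuming pandas 1.0 or later, drop_duplicates() also accepts ignore_index=True, which should give the same renumbered result in one call. The names a and b below are introduced for illustration only.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini', 'BMW', 'Mercedes', 'Porsche'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh', 'Delhi', 'Hyderabad', 'Mumbai'],
    'UnitsSold': [85, 70, 80, 95, 55, 90]
})

# Approach used in this article: drop duplicates, then renumber the index
a = dataFrame.drop_duplicates(subset=['Car', 'Place'], keep='last').reset_index(drop=True)

# Alternative (assumes pandas >= 1.0): let drop_duplicates renumber the index
b = dataFrame.drop_duplicates(subset=['Car', 'Place'], keep='last', ignore_index=True)

print(a.equals(b))
```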

raja

Published on 22-Sep-2021 12:26:25



Source: https://www.tutorialspoint.com/python-pandas-create-a-subset-and-display-only-the-last-entry-from-duplicate-values

Posted by: eckmanonswity.blogspot.com
