208 pandas Keywords

pandas is a Python library for data manipulation and analysis, and one of the main data science libraries in Python. The name derives from PANel DAta: multidimensional time series and cross-sectional data sets commonly found in statistics, experimental science results, econometrics, and finance. pandas is implemented primarily using NumPy and Cython, and is designed to integrate easily with NumPy-based scientific libraries such as statsmodels.

To create a reproducible pandas example:
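
A minimal sketch of what such an example might look like. The column names and values are hypothetical; the point is that the input is small, built inline so anyone can paste and run it, and accompanied by the attempted code and the desired result.

```python
import pandas as pd

# Small, hypothetical input that anyone can paste and run.
df = pd.DataFrame(
    {
        "key": ["a", "b", "a", "b"],
        "value": [1, 2, 3, 4],
    }
)

# The operation being asked about: total of "value" per "key".
result = df.groupby("key")["value"].sum()

# State the desired output explicitly so answers can be checked against it.
expected = pd.Series([4, 6], index=pd.Index(["a", "b"], name="key"), name="value")

# Compare what the code produces with what was wanted.
print(result.equals(expected))
```

Building the frame from a literal dictionary avoids any dependence on local files, which is usually what keeps a question from being reproducible.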

Main Features:

  • Data structures for 1- and 2-dimensional labeled data sets (Series and DataFrame, respectively). Some of their main features include:
      • Automatic data alignment and interpolation
      • Handling of missing observations in calculations
      • Convenient slicing and reshaping ("reindexing") functions
      • Categorical data types
      • 'Group by' aggregation and transformation functionality
      • Tools for merging/joining data sets
      • Simple matplotlib integration for plotting and graphing
      • Multi-indexing, which gives indices a hierarchical structure and allows data with an arbitrary number of dimensions to be represented
  • Date tools: objects for expressing date offsets or generating date ranges; some functionality similar to scikits.timeseries. Dates can be localized to a specific time zone and converted or compared at will
  • Statistical models: early releases included convenient ordinary least squares and panel OLS implementations for in-sample or rolling time series / cross-sectional regressions. These have since been removed from pandas; statsmodels now covers this functionality
  • Cython-accelerated internals: performance-critical computations run in compiled code
  • Static and moving statistical tools: mean, standard deviation, correlation, covariance
  • Rich user documentation, built with Sphinx
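
Several of the features listed above can be sketched in a few lines. This is only an illustration under assumed data: the labels, column names, and values are hypothetical, and each step shows one common way to use the feature rather than the only one.

```python
import pandas as pd

# Automatic alignment: arithmetic matches on index labels, not positions.
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])
aligned_sum = s1 + s2  # labels present on only one side become NaN

# Missing observations: reductions skip NaN by default.
total = aligned_sum.sum()

# Reindexing: conform to a new set of labels, filling gaps explicitly.
reindexed = s1.reindex(["a", "b", "c", "d"], fill_value=0.0)

# Group-by aggregation on a DataFrame.
sales = pd.DataFrame(
    {"region": ["north", "south", "north", "south"], "units": [5, 7, 3, 1]}
)
units_by_region = sales.groupby("region")["units"].sum()

# Merging/joining two data sets on a shared key column.
managers = pd.DataFrame({"region": ["north", "south"], "manager": ["Ana", "Raj"]})
merged = sales.merge(managers, on="region", how="left")

# Date ranges and a moving (rolling) statistic.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates)
rolling_mean = ts.rolling(window=2).mean()
```

Note that the first value of `rolling_mean` is NaN, because a window of two observations is not yet full at that point.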

Asking Questions:

  • Before asking a question, make sure you have gone through the 10 Minutes to pandas introduction, which covers the basic functionality of pandas.
  • See this question on asking good questions: How to make good reproducible pandas examples
  • Please provide the pandas and NumPy versions, plus platform details (if appropriate), in your question
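
One possible way to gather the version and platform details mentioned above is sketched here. The dictionary name `versions` is just an illustrative choice.

```python
import platform

import numpy as np
import pandas as pd

# Collect the details a question should state up front.
versions = {
    "pandas": pd.__version__,
    "numpy": np.__version__,
    "python": platform.python_version(),
    "platform": platform.platform(),
}

for name, value in versions.items():
    print(f"{name}: {value}")
```

For a fuller report that also lists optional dependencies, `pd.show_versions()` prints a similar summary.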


Source Info
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow