pandas is a Python library for panel data ("PAN-el DA-ta") manipulation and analysis: DataFrames, multidimensional time series, and cross-sectional data sets commonly found in statistics, experimental science results, econometrics, and finance. It is one of the main data science libraries in Python. pandas is implemented primarily using NumPy and Cython, and it is intended to integrate easily with NumPy-based scientific libraries such as statsmodels.
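As a rough illustration of the NumPy integration described above, the following minimal sketch builds a small DataFrame and passes it through NumPy. The column names and values are hypothetical, chosen only for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three daily observations of two series.
index = pd.date_range("2024-01-01", periods=3, freq="D")
df = pd.DataFrame(
    {"price": [10.0, 11.0, 12.5], "volume": [100, 150, 120]},
    index=index,
)

# NumPy functions can be applied to pandas objects directly;
# the result is still a pandas Series with the original index.
log_price = np.log(df["price"])

# The underlying values are also available as a plain NumPy array.
values = df.to_numpy()
```

Here `log_price` keeps the date index of `df`, while `values` is an ordinary two-dimensional NumPy array that other NumPy-based libraries can consume.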
'Group by' aggregation and transformation functionality
Tools for merging/joining together data sets
Simple matplotlib integration for plotting and graphing
Multi-indexing, which adds structure to indices so that they can represent an arbitrary number of dimensions
Date tools: objects for expressing date offsets or generating date ranges; some functionality similar to scikits.timeseries. Dates can be aligned to a specific time zone and converted or compared at will
Statistical models: convenient ordinary least squares and panel OLS implementations for in-sample or rolling time series / cross-sectional regressions. These will hopefully be the starting point for implementing models
Cython offloading: complex computations run quickly because performance-critical code is offloaded to Cython
Static and moving statistical tools: mean, standard deviation, correlation, covariance
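Several of the features listed above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data, column names, and variable names are invented, and it covers grouping, merging, multi-indexing, date tools, and static and moving statistics, not plotting or regression models.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "quarter": [1, 2, 1, 2],
    "revenue": [10.0, 12.0, 8.0, 9.0],
})
managers = pd.DataFrame({"region": ["east", "west"], "manager": ["Ana", "Bo"]})

# 'Group by' aggregation (one value per group) and transformation
# (one value per original row, aligned with the input).
total_by_region = sales.groupby("region")["revenue"].sum()
share = sales["revenue"] / sales.groupby("region")["revenue"].transform("sum")

# Merging / joining two data sets on a shared key.
merged = sales.merge(managers, on="region", how="left")

# Multi-indexing: two index levels represent two dimensions in one Series.
indexed = sales.set_index(["region", "quarter"])["revenue"]
east_q2 = indexed.loc[("east", 2)]

# Date tools: date ranges, date offsets, and time zone conversion.
days = pd.date_range("2024-01-01", periods=5, freq="D", tz="UTC")
next_week = days + pd.DateOffset(weeks=1)
in_new_york = days.tz_convert("America/New_York")

# Static and moving statistics on a time-indexed Series.
prices = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0], index=days)
overall_mean = prices.mean()
rolling_mean = prices.rolling(window=3).mean()  # first two entries are NaN
rolling_std = prices.rolling(window=3).std()
lag_corr = prices.corr(prices.shift(1))         # correlation with the 1-day lag
```

The rolling window of 3 is an arbitrary choice for the sketch. With the default settings, a rolling statistic is undefined until the window is full, which is why the first two entries of `rolling_mean` are missing.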