sklearn-pdtransform
===================

Installation:

.. code::

    $ pip install pdtransform

A small package with a few transformers for working with pandas DataFrames in scikit-learn pipelines. I found myself writing these quite frequently. Example usage:

.. code:: python

    from sklearn.pipeline import Pipeline

    from pdtransform import DFTransform, DFFeatureUnion

    # _ordinal_to_nums, _one_hot_encode, _add_features, _remove_skew and
    # _find_outliers are user-defined functions that are not shown here.
    pipeline = Pipeline([
        ('ordinal_to_nums', DFTransform(_ordinal_to_nums, copy=True)),
        ('union', DFFeatureUnion([
            ('categorical', Pipeline([
                ('select', DFTransform(lambda X: X.select_dtypes(include=['object']))),
                ('fill_na', DFTransform(lambda X: X.fillna('NA'))),
                ('one_hot', DFTransform(_one_hot_encode)),
            ])),
            ('numerical', Pipeline([
                ('select', DFTransform(lambda X: X.select_dtypes(exclude=['object']))),
                ('fill_median', DFTransform(lambda X: X.fillna(X.median()))),
                ('add_features', DFTransform(_add_features, copy=True)),
                ('remove_skew', DFTransform(_remove_skew, copy=True)),
                ('find_outliers', DFTransform(_find_outliers, copy=True)),
                ('normalize', DFTransform(lambda X: X.div(X.max()))),
            ])),
        ])),
    ])
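
To make the idea behind the example concrete, here is a minimal, self-contained sketch of how a function-wrapping transformer and a column-wise union that both return DataFrames could be written. It is not the package's actual implementation: the class names ``FuncDFTransform`` and ``ColumnConcat`` are hypothetical stand-ins, and the behaviour of the ``copy`` flag is an assumption based only on how it is used above.

.. code:: python

    import pandas as pd
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.pipeline import Pipeline


    class FuncDFTransform(TransformerMixin, BaseEstimator):
        """Hypothetical stand-in: apply a function to a DataFrame."""

        def __init__(self, func, copy=False):
            self.func = func
            self.copy = copy

        def fit(self, X, y=None):
            # Stateless; the flag only marks the estimator as fitted.
            self.is_fitted_ = True
            return self

        def transform(self, X):
            # Assumption: copy=True protects the input from in-place edits.
            X_ = X.copy() if self.copy else X
            return self.func(X_)


    class ColumnConcat(TransformerMixin, BaseEstimator):
        """Hypothetical stand-in: join transformer outputs column-wise."""

        def __init__(self, transformer_list):
            self.transformer_list = transformer_list

        def fit(self, X, y=None):
            for _, transformer in self.transformer_list:
                transformer.fit(X, y)
            self.is_fitted_ = True
            return self

        def transform(self, X):
            parts = [t.transform(X) for _, t in self.transformer_list]
            # pd.concat keeps the index and column names of each part.
            return pd.concat(parts, axis=1)


    df = pd.DataFrame({
        'color': pd.Series(['red', None, 'blue'], dtype='object'),
        'size': [1.0, None, 4.0],
    })

    pipeline = Pipeline([
        ('union', ColumnConcat([
            ('categorical', Pipeline([
                ('select', FuncDFTransform(lambda X: X.select_dtypes(include=['object']))),
                ('fill_na', FuncDFTransform(lambda X: X.fillna('NA'))),
            ])),
            ('numerical', Pipeline([
                ('select', FuncDFTransform(lambda X: X.select_dtypes(exclude=['object']))),
                ('fill_median', FuncDFTransform(lambda X: X.fillna(X.median()))),
                ('normalize', FuncDFTransform(lambda X: X.div(X.max()))),
            ])),
        ])),
    ])

    result = pipeline.fit_transform(df)

In this sketch ``result`` is still a DataFrame with the ``color`` and ``size`` columns, so later steps can keep using column names and dtypes rather than positional NumPy indices.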

For more information, read `this blog post <https://signal-to-noise.xyz/post/sklearn-pipeline/>`_.