Clumper

A small Python library that can clump lists of nested data together.

Part of a video series on calmcode.io.

Base Example

Clumper allows you to quickly parse through a list of json-like data.

Here's an example of such a dataset.

pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    ...
]

Given this list of dictionaries, we can write the following query:

from clumper import Clumper

clump = Clumper.read_json('https://calmcode.io/datasets/pokemon.json')

(clump
  .keep(lambda d: len(d['type']) == 1)
  .mutate(type=lambda d: d['type'][0],
          ratio=lambda d: d['attack']/d['hp'])
  .select('name', 'type', 'ratio')
  .sort(lambda d: d['ratio'], reverse=True)
  .head(5)
  .collect())

What this code does, line by line:

It imports Clumper.
It fetches a list of json-blobs about pokemon from the internet.
It removes all the pokemon that have more than 1 type.
The dictionaries that are left will have their type now as a string instead of a list of strings.
The dictionaries that are left will also have a property called ratio, which is attack divided by hp.
All the keys besides name, type and ratio are removed.
The collection is sorted by ratio, from high to low.
We grab the top 5 after sorting.
The results are returned as a list of dictionaries.

This is what we get back:

[{'name': 'Diglett', 'type': 'Ground', 'ratio': 5.5},
 {'name': 'DeoxysAttack Forme', 'type': 'Psychic', 'ratio': 3.6},
 {'name': 'Krabby', 'type': 'Water', 'ratio': 3.5},
 {'name': 'DeoxysNormal Forme', 'type': 'Psychic', 'ratio': 3.0},
 {'name': 'BanetteMega Banette', 'type': 'Ghost', 'ratio': 2.578125}]
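
For comparison, the steps listed above can be sketched in plain Python without the library. This is only an illustration of what each verb does to the data, not how Clumper is implemented. The function name `top_ratio` is hypothetical, and the Diglett and Krabby rows are illustrative values chosen to agree with the ratios shown in the output above; the first two rows come from the example dataset.

```python
# Small sample in the same shape as the pokemon dataset.
pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    # Illustrative rows, picked to match the ratios in the output above.
    {'name': 'Diglett', 'type': ['Ground'], 'hp': 10, 'attack': 55},
    {'name': 'Krabby', 'type': ['Water'], 'hp': 30, 'attack': 105},
]


def top_ratio(rows, n=5):
    """Plain-Python version of the keep/mutate/select/sort/head chain."""
    # keep: only pokemon with exactly one type
    single = [d for d in rows if len(d['type']) == 1]
    # mutate: unwrap the type list and add the attack/hp ratio
    mutated = [
        {**d, 'type': d['type'][0], 'ratio': d['attack'] / d['hp']}
        for d in single
    ]
    # select: drop every key except name, type and ratio
    selected = [{k: d[k] for k in ('name', 'type', 'ratio')} for d in mutated]
    # sort: highest ratio first
    ranked = sorted(selected, key=lambda d: d['ratio'], reverse=True)
    # head: take the first n items
    return ranked[:n]


result = top_ratio(pokemon)
```

Each intermediate list here corresponds to one verb in the chained query, which is the bookkeeping that the chained style lets you skip.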

Documentation

We've got a lovely documentation page that explains how the library works.

Features

This library has no dependencies besides a modern version of Python.
The library offers a pattern of verbs that are very expressive.
You can write code from top to bottom, left to right.
You can read in many json/yaml/csv files by using a wildcard *.
MIT License
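
To give a feel for what reading many files through a wildcard amounts to, here is a minimal standard-library sketch for JSON files. It is not Clumper's own reader; the helper name `read_json_glob` and the temporary files are assumptions made up for this illustration.

```python
import glob
import json
import os
import tempfile


def read_json_glob(pattern):
    """Load every JSON file matching the pattern into one flat list."""
    rows = []
    # Sort the matches so the result order is deterministic.
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            data = json.load(f)
        # A file may hold a list of items or a single item.
        rows.extend(data if isinstance(data, list) else [data])
    return rows


# Write two small files to a temporary folder, then read them back with "*".
with tempfile.TemporaryDirectory() as folder:
    for i in range(2):
        with open(os.path.join(folder, f"part-{i}.json"), "w") as f:
            json.dump([{"file": i}], f)
    combined = read_json_glob(os.path.join(folder, "*.json"))
```

The combined list can then be treated like any other list of dictionaries.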

Installation

You can install this package via pip.

pip install clumper

It may be safer, however, to install via:

python -m pip install clumper

For details on why, check out this resource.

There are some optional dependencies that you might want to install as well.

python -m pip install clumper[yaml]

Contributing

Before you make a pull request, check the issue list to avoid duplicate work. To get started locally, clone the repo and use the Makefile.

git clone git@github.com:koaning/clumper.git
cd clumper
make install-dev
