
clumper

public
147 stars
15 forks
11 issues

Commits

List of commits on branch main.
075eead3f7fac994da1110a4f17b8745640ef6a9

Reader encodings (#97)

koaning committed 3 years ago
578a56ad94998575e3f5337a9c9d21f4e0dd6702

Flatmap (#95)

koaning committed 4 years ago
df12a70bd91ed8713c7960f9b074e9675c5479b2

Version 0.2.13 (#94)

koaning committed 4 years ago
181c3cd83234f27ca04046e07bf6db83d89ce3cd

Interrogate (#92)

koaning committed 4 years ago
6846bb18c07619bfbf743cbe210ac98fa9c440e2

Add python3.9 (#90)

koaning committed 4 years ago
9315fc74050a3aff9ff7c4f44630559abd2009ff

version 0.2.12 (#89)

koaning committed 4 years ago

README

The README file for this repository.

Clumper

A small python library that can clump lists of nested data together.

Part of a video series on calmcode.io.

Base Example

Clumper allows you to quickly parse through a list of json-like data.

Here's an example of such a dataset.

pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
    ...
]

Given this list of dictionaries, we can write the following query:

from clumper import Clumper

clump = Clumper.read_json('https://calmcode.io/datasets/pokemon.json')

(clump
  .keep(lambda d: len(d['type']) == 1)
  .mutate(type=lambda d: d['type'][0],
          ratio=lambda d: d['attack']/d['hp'])
  .select('name', 'type', 'ratio')
  .sort(lambda d: d['ratio'], reverse=True)
  .head(5)
  .collect())
Line by line, this code performs the following steps:
  1. It imports Clumper.
  2. It fetches a list of json-blobs about pokemon from the internet.
  3. It keeps only the pokemon that have exactly 1 type.
  4. The dictionaries that are left will have their type now as a string instead of a list of strings.
  5. The dictionaries that are left will also have a property called ratio, which is attack divided by hp.
  6. All the keys besides name, type and ratio are removed.
  7. The collection is sorted by ratio, from high to low.
  8. We grab the top 5 after sorting.
  9. The results are returned as a list of dictionaries.

This is what we get back:

[{'name': 'Diglett', 'type': 'Ground', 'ratio': 5.5},
 {'name': 'DeoxysAttack Forme', 'type': 'Psychic', 'ratio': 3.6},
 {'name': 'Krabby', 'type': 'Water', 'ratio': 3.5},
 {'name': 'DeoxysNormal Forme', 'type': 'Psychic', 'ratio': 3.0},
 {'name': 'BanetteMega Banette', 'type': 'Ghost', 'ratio': 2.578125}]
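
To make the steps above concrete, here is a rough plain-Python equivalent of the same query. It is only a sketch of what each verb does to a list of dictionaries, not how Clumper is implemented, and it assumes just the two sample rows shown earlier rather than the full dataset.

```python
# Two sample rows copied from the example dataset above (the real file has many more).
pokemon = [
    {'name': 'Bulbasaur', 'type': ['Grass', 'Poison'], 'hp': 45, 'attack': 49},
    {'name': 'Charmander', 'type': ['Fire'], 'hp': 39, 'attack': 52},
]

# keep: retain only the rows with exactly one type
rows = [d for d in pokemon if len(d['type']) == 1]

# mutate: overwrite 'type' with its single entry and add 'ratio' (attack / hp)
rows = [{**d, 'type': d['type'][0], 'ratio': d['attack'] / d['hp']} for d in rows]

# select: drop every key except name, type and ratio
rows = [{k: d[k] for k in ('name', 'type', 'ratio')} for d in rows]

# sort: order by ratio, from high to low
rows = sorted(rows, key=lambda d: d['ratio'], reverse=True)

# head: take at most the first five rows; the result is already a plain list
result = rows[:5]
print(result)
```

With only these two rows, Bulbasaur is filtered out because it has two types, so a single dictionary for Charmander remains.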

Documentation

We've got a lovely documentation page that explains how the library works.

Features

  • This library has no dependencies besides a modern version of python.
  • The library offers a pattern of verbs that are very expressive.
  • You can write code from top to bottom, left to right.
  • You can read in many json/yaml/csv files by using a wildcard *.
  • MIT License
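
As an illustration of what reading many files through a wildcard amounts to, the sketch below combines every matching JSON file into one flat list using only the standard library. The helper name `read_json_wildcard` and the temporary file names are hypothetical and exist only for this example; the library's own reader may behave differently.

```python
import glob
import json
import os
import tempfile


def read_json_wildcard(pattern):
    """Hypothetical helper: load every JSON file matching `pattern` into one list."""
    rows = []
    # Sort the matches so the combined order is deterministic.
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            data = json.load(f)
        # A file may hold a list of items or a single item.
        rows.extend(data if isinstance(data, list) else [data])
    return rows


# Write two small files to a temporary folder, then read them back with a wildcard.
with tempfile.TemporaryDirectory() as tmp:
    for i, name in enumerate(['Bulbasaur', 'Charmander']):
        with open(os.path.join(tmp, f'part{i}.json'), 'w') as f:
            json.dump([{'name': name}], f)
    combined = read_json_wildcard(os.path.join(tmp, '*.json'))

print(combined)
```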

Installation

You can install this package via pip.

pip install clumper

It may be safer, however, to install via:

python -m pip install clumper

For details on why, check out this resource.

There are some optional dependencies that you might want to install as well.

python -m pip install clumper[yaml]

Contributing

Before you make a pull request, check the issue list to avoid duplicate work. To get started locally, clone the repo and use the Makefile to set up a development install.

git clone git@github.com:koaning/clumper.git
cd clumper
make install-dev