stdlib-stats

Commits

List of commits on branch main.
Unverified
c5d0ce65e80354e79d52fd1f2155ed40812fdbbc

Merge branch 'main' of github.com:brettcannon/stdlib-stats

brettcannon committed 4 years ago
Unverified
0280e20fb1474a212070aa331b2f563a6527e104

Record the median and mean # of PRs per module

brettcannon committed 4 years ago
Unverified
0894dde64831d3666eb63f48256dcccbbd8cdf3d

Use "rb" to read files for json.load()

brettcannon committed 4 years ago
Unverified
0d6bd7bacf8b49d2718b1dd5e54209de343fd556

Add a stat for the number of open PRs involving the stdlib

brettcannon committed 4 years ago
Verified
8d2c4e43005df39f3d7acfe528eeef5ba3c7d962

Add Binder badge, configs and instructions (#1)

luizirber committed 4 years ago
Unverified
03d3b5b40caa55857c1f0f31e49336c425db0b68

Tweak README

brettcannon committed 4 years ago

README

stdlib-stats

Various statistics on Python's standard library.

See stats.ipynb for charts that show the data in various ways.

You can also run an interactive Jupyter session on Binder via the repository's Binder badge.

Organized data

stdlib.csv contains details about each module in the stdlib. The table is built from the JSON files in this repository (discussed below). To change how something is treated, edit the relevant JSON file and run aggregate.py to regenerate the table.

category_usage.csv counts the number of projects that use each module category. It also tallies the commits made to the modules in each category.

stats.ipynb is a Jupyter notebook with charts that analyze the CSV data in several ways.

Raw data

Map module to public module

Public availability is (mostly) determined by documentation existing in Doc/library/.

private_modules.json maps public modules to any private modules they depend on. Modules that "cheat" by using a private module directly instead of its equivalent public API are not listed as dependents (e.g. multiprocessing directly using _weakrefset instead of going through weakref).

Map file to module

Ignores Argument Clinic files and tests, but includes header files.

file_map.json maps module name to relative file paths in a git clone.
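As a rough illustration of how such a mapping can be used, the sketch below inverts it so that a file path can be traced back to its module. It assumes file_map.json is a JSON object of the form {module: [relative paths]}; that layout, the helper name invert_file_map, and the sample entries are assumptions for illustration, not taken from the repository.

```python
import json


def invert_file_map(file_map):
    """Return {path: module} from an assumed {module: [paths]} mapping."""
    path_to_module = {}
    for module, paths in file_map.items():
        for path in paths:
            path_to_module[path] = module
    return path_to_module


# Hypothetical sample in the assumed layout; the real file may differ.
sample = json.loads(
    '{"json": ["Lib/json/__init__.py", "Modules/_json.c"], "abc": ["Lib/abc.py"]}'
)
path_to_module = invert_file_map(sample)
print(path_to_module["Modules/_json.c"])
```

With the real data, the same function could be fed the result of json.load() on file_map.json, provided the file follows the assumed layout.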

Modules required to start Python

required.json lists the modules required to start Python (based on python -v -S -c pass).
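One possible way to collect such a list is to parse the import trace that the command writes to stderr. The sketch below is an assumption about how this could be done, not the script used to produce required.json; the helper names parse_verbose_imports and startup_modules are hypothetical.

```python
import re
import subprocess
import sys

# Matches trace lines such as "import _imp # builtin"
# and "import 'codecs' # <...loader...>".
IMPORT_LINE = re.compile(r"^import '?([\w.]+)'? #")


def parse_verbose_imports(stderr_text):
    """Return the sorted, de-duplicated module names found in a `python -v` trace."""
    names = set()
    for line in stderr_text.splitlines():
        match = IMPORT_LINE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def startup_modules(python=sys.executable):
    """Run `python -v -S -c pass` and parse its import trace (written to stderr)."""
    result = subprocess.run(
        [python, "-v", "-S", "-c", "pass"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_verbose_imports(result.stderr)
```

Calling startup_modules() returns the modules imported by the interpreter it is run against, so the result will vary between Python versions and platforms.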

Usage of a module in the public

usage.json lists the modules used by the 4000 most downloaded projects over the past year on PyPI.

The projects are listed in top-pypi-packages-365-days.json, as fetched from Top PyPI Packages. They are downloaded using isidentical/syntax_test_suite.
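To show the kind of question this data can answer, the sketch below counts how many projects use each module. It assumes usage.json maps each project name to a list of the stdlib modules it imports; that layout, the function name count_module_usage, and the sample data are assumptions for illustration.

```python
from collections import Counter


def count_module_usage(usage):
    """Count how many projects use each module.

    `usage` is assumed to map project name -> iterable of module names.
    """
    counts = Counter()
    for modules in usage.values():
        counts.update(set(modules))  # count each module once per project
    return counts


# Hypothetical data in the assumed layout.
sample_usage = {
    "project-a": ["os", "json", "os"],
    "project-b": ["os", "re"],
}
usage_counts = count_module_usage(sample_usage)
print(usage_counts.most_common(1))
```

De-duplicating per project keeps a module that is imported many times in one project from outweighing a module used once in many projects.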

Grouped by category

categories.json groups modules by category according to the library index.

The __future__ module is specially treated and put in its own category.
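A per-category project count could be derived from the category grouping and the usage data roughly as follows. This is a sketch under assumptions: categories is taken to map category name to a list of modules, usage to map project name to a list of modules, and projects_per_category is a hypothetical helper rather than code from aggregate.py.

```python
from collections import Counter


def projects_per_category(categories, usage):
    """Count the projects that use at least one module from each category."""
    module_to_category = {
        module: category
        for category, modules in categories.items()
        for module in modules
    }
    counts = Counter()
    for modules in usage.values():
        # A project counts once per category, however many of its modules it uses.
        used = {module_to_category[m] for m in modules if m in module_to_category}
        counts.update(used)
    return counts


# Hypothetical data in the assumed layouts.
sample_categories = {
    "Text Processing Services": ["re", "string"],
    "File and Directory Access": ["pathlib", "glob"],
    "__future__": ["__future__"],
}
sample_projects = {
    "project-a": ["re", "string", "pathlib"],
    "project-b": ["re", "__future__"],
    "project-c": ["not_stdlib"],
}
category_counts = projects_per_category(sample_categories, sample_projects)
print(category_counts["Text Processing Services"])
```

Modules that appear in no category are skipped here; a real aggregation might prefer to report them instead.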

Commit stats per file

commit_stats.json records, for each file, the oldest commit, the newest commit, and the SHA hashes of all commits made to that file.
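Statistics of this kind can be gathered from git log. The sketch below is one assumed approach, not the repository's own script: it records commit dates for the oldest and newest entries, and the dictionary keys and the helper names summarize_log and file_commit_stats are hypothetical.

```python
import subprocess


def summarize_log(log_text):
    """Summarize `git log --format='%H %cI'` output, which lists newest commits first."""
    entries = [line.split(" ", 1) for line in log_text.splitlines() if line.strip()]
    if not entries:
        return {"oldest": None, "newest": None, "shas": []}
    return {
        "oldest": entries[-1][1],  # last line is the earliest commit
        "newest": entries[0][1],   # first line is the most recent commit
        "shas": [sha for sha, _ in entries],
    }


def file_commit_stats(path, repo="."):
    """Collect commit hashes and committer dates for one file in a git clone."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%H %cI", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return summarize_log(result.stdout)
```

For example, file_commit_stats("Lib/abc.py", repo="path/to/cpython") would summarize the history of that file in a local clone, where the clone path is a placeholder.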