stdlib-stats

public
21 stars
2 forks
0 issues

Commits

List of commits on branch main.
69920c97bf5c150b27b9c6f893462c27098c9cb6

Reset execution counts

brettcannon committed 4 years ago

3e1a875309df63d270455c2c7321f49a07e63a0d

Add some box and point charts for categories

brettcannon committed 4 years ago

bd8e43842f69fea1f14a67c9771f19ca0a3801c5

Update for PR stats

brettcannon committed 4 years ago

56aa4a9931b975ebb9054d9d7d86bc2d1d644261

Record of release dates

brettcannon committed 4 years ago

843835ddefa6769f368debfe540e53923cec5097

Save PR file details

brettcannon committed 4 years ago

6ed4c6a526731edad946769a5ce445a998485e65

Get file paths as part of the PR request for perf reasons

brettcannon committed 4 years ago

README

The README file for this repository.

stdlib-stats

Various statistics on Python's standard library.

See stats.ipynb for charts that show the data in various ways.

You can also run an interactive Jupyter session using Binder:

Binder

Organized data

stdlib.csv contains details about the modules in the stdlib. The table is built from the JSON files found in this repository (discussed below). To change how something is treated, edit the relevant JSON file and run aggregate.py to regenerate the table.

category_usage.csv counts the number of projects which use a specific module category. It also tallies all the commits made to the modules in that category.
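The per-category project count could be computed along the following lines. This is a minimal sketch, not the logic in aggregate.py. It assumes the usage data is shaped as a mapping from project name to a list of module names, and the category data as a mapping from category name to a list of module names. Both shapes, and the function name count_projects_per_category, are assumptions made for illustration. The commit tally is left out.

```python
from collections import Counter


def count_projects_per_category(usage, categories):
    """Count how many projects use at least one module from each category.

    Assumed (hypothetical) shapes:
      usage:      {project_name: [module_name, ...]}
      categories: {category_name: [module_name, ...]}
    """
    # Invert the category mapping so each module points at its category.
    module_to_category = {
        module: category
        for category, modules in categories.items()
        for module in modules
    }
    # Start every category at zero so unused categories still appear.
    counts = Counter({category: 0 for category in categories})
    for modules in usage.values():
        # A set ensures a project is counted at most once per category.
        used = {module_to_category[m] for m in modules if m in module_to_category}
        counts.update(used)
    return dict(counts)
```

A project that imports several modules from the same category contributes one to that category, which matches "the number of projects" rather than the number of imports.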

stats.ipynb is a Jupyter notebook containing charts that analyze the data from the CSV in various ways.

Raw data

Map module to public module

Public availability is (mostly) determined by documentation existing in Doc/library/.
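As a rough illustration of the documentation check, the sketch below tests whether a module has a page under Doc/library/ in a CPython checkout. It assumes the page is named after the module with an .rst extension. That naming, and the function name is_documented, are assumptions, and the "(mostly)" above means some modules will not follow this rule.

```python
from pathlib import Path


def is_documented(checkout, module):
    """Return True if Doc/library/<module>.rst exists in the checkout.

    `checkout` is the path to a CPython git clone. The <module>.rst
    naming is an assumption, not something this README specifies.
    """
    return (Path(checkout) / "Doc" / "library" / f"{module}.rst").is_file()
```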

private_modules.json maps public modules to any private modules they depend on. Modules that "cheat" by using a private module directly instead of its equivalent public API are not listed as dependents of that private module (e.g. multiprocessing directly using _weakrefset instead of going through weakref).

Map file to module

Ignores Argument Clinic files and tests, but includes header files.

file_map.json maps module name to relative file paths in a git clone.
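When working from a changed file back to its module, the mapping has to be inverted. The sketch below does that, assuming the JSON decodes to a mapping from module name to a list of relative paths. That shape and the function name invert_file_map are assumptions.

```python
def invert_file_map(file_map):
    """Turn {module: [path, ...]} into {path: module}.

    The input shape is assumed. If a path is listed under more than
    one module, the module seen last wins.
    """
    path_to_module = {}
    for module, paths in file_map.items():
        for path in paths:
            path_to_module[path] = module
    return path_to_module
```

The data could be loaded with json.load on an open file_map.json before being passed in.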

Modules required to start Python

required.json lists the modules required to start Python (based on python -v -S -c pass).
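A list like this can be regenerated by capturing the verbose import trace. The sketch below is one possible approach, not necessarily how required.json was produced. It assumes the trace is written to stderr and that each import appears on a line of the form `import 'name' # ...` or `import name # ...`. The function names are hypothetical.

```python
import re
import subprocess
import sys

# Matches "import 'codecs' # ..." as well as "import _imp # builtin".
_IMPORT_LINE = re.compile(r"^import '?([\w.]+)'? #", re.MULTILINE)


def parse_verbose_imports(verbose_output):
    """Extract module names from `python -v` output, in first-seen order."""
    seen = {}
    for name in _IMPORT_LINE.findall(verbose_output):
        seen.setdefault(name, None)  # dict keys preserve insertion order
    return list(seen)


def modules_needed_at_startup(python=sys.executable):
    """Run `python -v -S -c pass` and return the modules it imports."""
    result = subprocess.run(
        [python, "-v", "-S", "-c", "pass"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: the verbose import trace goes to stderr.
    return parse_verbose_imports(result.stderr)
```

The result depends on the interpreter version and platform, so the list should be regenerated with the same Python that the rest of the data describes.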

Usage of a module in the public

usage.json lists the modules used by the 4000 most downloaded projects over the past year on PyPI.

The projects are listed in top-pypi-packages-365-days.json as fetched from Top PyPI Packages. The projects are downloaded by isidentical/syntax_test_suite.

Grouped by category

categories.json groups modules by category according to the library index.

The __future__ module is specially treated and put in its own category.

Commit stats per file

commit_stats.json tracks, for each file, the oldest commit, the newest commit, and the SHA hashes of all the commits made to that file.
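Per-file statistics of this kind can be derived from git log. The sketch below is an illustration under stated assumptions, not the script used to build commit_stats.json. The keys "oldest", "newest", and "shas", the use of committer dates, and the function names are all hypothetical, since the layout of the JSON file is not described here.

```python
import subprocess
from datetime import datetime


def parse_log(log_output):
    """Summarize lines of '<sha> <ISO 8601 date>' into per-file stats."""
    commits = [
        line.split(" ", 1) for line in log_output.splitlines() if line.strip()
    ]
    if not commits:
        return None
    # Parse the dates so commits with different UTC offsets compare correctly.
    dates = [datetime.fromisoformat(date) for _, date in commits]
    return {
        "oldest": min(dates).isoformat(),
        "newest": max(dates).isoformat(),
        "shas": [sha for sha, _ in commits],
    }


def commit_stats(repo, path):
    """Collect stats for `path` (relative to the clone at `repo`)."""
    out = subprocess.run(
        # %H is the full commit hash, %cI the strict ISO 8601 committer date.
        ["git", "-C", repo, "log", "--format=%H %cI", "--", path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_log(out)
```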