
pypi-data

public
124 stars
8 forks
6 issues

Commits

List of commits on branch main.
Verified
5df30e7c24732bf46de5cf134e69e226b8f64b00

Don't use urllib3 2.x features

sethmlarson committed 3 months ago
Unverified
37885e8b93eb705ff56866fa92383148bc8f0590

Add info about new `classifiers` table to readme

edgarrmondragon committed 6 months ago
Unverified
9dcce18beb88c647e2d820085972cb085d27ec8b

Add `classifiers` table

edgarrmondragon committed 8 months ago
Unverified
daece5889f07091d3f30fb954c22927d465dcccb

Update CI versions

hugovk committed a year ago
Verified
275ef06768bf4fff8a9955d40947818775c41c93

Fix optional data from deps.dev

sethmlarson committed a year ago
Unverified
b5c89605b3eb21a4523d6b4d2524a9f0fc9d1499

Add more data about upload time, vulns

sethmlarson committed 2 years ago

README

The README file for this repository.

PyPI Data

Mostly up-to-date data about almost every package on PyPI

Download the database from the repository's GitHub releases, then decompress and query it:

$ gunzip pypi.db.gz
$ sqlite3 'pypi.db' 'SELECT * FROM packages LIMIT 10 OFFSET 1000;'

acid-vault|1.3.2|>=3.6|1|0|2021-01-21 04:37:10
acidcli|1.0.1|>=3.6|0|0|2021-01-21 04:37:10
acidfile|1.2.1||0|0|2021-01-21 04:37:10
acidfs|1||0|0|2021-01-21 04:37:10
acidoseq|1.3.7||0|0|2021-01-21 04:37:10
acinonyx|0.1.0|>=3.6.0|0|0|2021-01-21 04:37:10
aciops|2.0.0|>=3.6|0|0|2021-01-21 04:37:10
acitoolkit|0.4||0|0|2021-01-21 04:37:10
ackeras|0.1.1||0|0|2021-01-21 04:37:10
ackg|0.0.5||0|0|2021-01-21 04:37:10
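
The same kind of query can be run from Python with the standard-library `sqlite3` module. The following is a minimal sketch, not part of the project: the helper name `first_packages` is hypothetical, it adds an `ORDER BY name` so results are deterministic (the shell command above does not), and the demo runs against a tiny in-memory stand-in rather than the real `pypi.db`.

```python
import sqlite3


def first_packages(conn, limit=10, offset=0):
    """Return (name, version, requires_python) rows, ordered by name."""
    query = (
        "SELECT name, version, requires_python "
        "FROM packages ORDER BY name LIMIT ? OFFSET ?"
    )
    return conn.execute(query, (limit, offset)).fetchall()


# In-memory stand-in with only the columns used here (an assumption for the
# demo). With the real file, use sqlite3.connect("pypi.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages ("
    "name STRING, version STRING, requires_python STRING, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("acidfile", "1.2.1", ""),
        ("acid-vault", "1.3.2", ">=3.6"),
        ("acidcli", "1.0.1", ">=3.6"),
    ],
)

rows = first_packages(conn, limit=2, offset=1)
for name, version, requires_python in rows:
    print(name, version, requires_python or "(any)")
```

Passing `limit` and `offset` as bound parameters keeps the query text constant and avoids string formatting in SQL.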

Data being tracked

  • Name, Version, Upload Time
  • Direct requirements (requires_dist)
  • Extra requirements (requires_extras)
  • Python requirements (requires_python)
  • Trove classifiers (classifiers)
  • Yanked versions (yanked)
  • Wheel data (build_tag, python_tags, abi_tags, platform_tags)
  • Maintainers on PyPI
  • URLs used by packages
  • OpenSSF scorecard data
  • Google Assured OSS
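
To check which of these items are present in a particular release of the database, the tables and their columns can be listed directly. This is a rough sketch using only the standard-library `sqlite3` module; the helper name `describe_tables` is hypothetical, and the demo uses an in-memory stand-in with two tables rather than the real file.

```python
import sqlite3


def describe_tables(conn):
    """Map each table name to the list of its column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
    # for every column of the given table.
    return {
        table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
        for table in tables
    }


# Stand-in database for the demo; with the real file, use
# sqlite3.connect("pypi.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE maintainers (name STRING, package_name STRING)")
conn.execute(
    "CREATE TABLE classifiers ("
    "package_name TEXT, name TEXT, PRIMARY KEY (package_name, name))"
)

layout = describe_tables(conn)
for table, columns in layout.items():
    print(table, columns)
```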

Database Schemas

-- Packages --
CREATE TABLE packages (
    name STRING,
    version STRING,
    requires_python STRING,
    yanked BOOLEAN DEFAULT FALSE,
    has_binary_wheel BOOLEAN,
    has_vulnerabilities BOOLEAN,
    first_uploaded_at TIMESTAMP,
    last_uploaded_at TIMESTAMP,
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    downloads INTEGER,
    scorecard_overall FLOAT,
    in_google_assured_oss BOOLEAN,
    PRIMARY KEY (name)
);

-- Dependencies --
CREATE TABLE deps (
    package_name STRING,
    extra STRING DEFAULT NULL,
    dep_name STRING,
    dep_specifier STRING,
    PRIMARY KEY (package_name, dep_name, dep_specifier)
);

-- Wheel data --
CREATE TABLE wheels (
    package_name STRING,
    filename STRING,
    build STRING,
    python STRING,
    abi STRING,
    platform STRING,
    uploaded_at TIMESTAMP,
    PRIMARY KEY (package_name, filename)
);

-- Maintainer data --
CREATE TABLE maintainers (
    name STRING,
    package_name STRING
);

-- Package URLs --
CREATE TABLE package_urls (
    package_name STRING,
    name STRING,
    url STRING,
    public_suffix STRING
);

-- OpenSSF Scorecard --
CREATE TABLE scorecard_checks (
    package_name STRING,
    name STRING,
    score INTEGER
);

-- Trove Classifiers --
CREATE TABLE classifiers (
    package_name TEXT,
    name TEXT,
    PRIMARY KEY (package_name, name),
    FOREIGN KEY (package_name) REFERENCES packages(name)
);
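
As an illustration of how these tables relate, the sketch below joins `deps` to `packages` to list the packages that declare a given dependency. It is an example under stated assumptions, not a query shipped with the project: the helper name `reverse_dependencies` and the package names in the demo are hypothetical, and it assumes `dep_name` is stored in the same form as the name you search for.

```python
import sqlite3


def reverse_dependencies(conn, dep_name):
    """Return (package_name, dep_specifier, extra) for packages requiring dep_name."""
    query = """
        SELECT d.package_name, d.dep_specifier, d.extra
        FROM deps AS d
        JOIN packages AS p ON p.name = d.package_name
        WHERE d.dep_name = ?
        ORDER BY d.package_name, d.dep_specifier
    """
    return conn.execute(query, (dep_name,)).fetchall()


# Stand-in database with a reduced `packages` table (assumption for the demo)
# and the `deps` table as declared in the schema above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE packages (name STRING, version STRING, PRIMARY KEY (name));
    CREATE TABLE deps (
        package_name STRING,
        extra STRING DEFAULT NULL,
        dep_name STRING,
        dep_specifier STRING,
        PRIMARY KEY (package_name, dep_name, dep_specifier)
    );
    """
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [("example-a", "1.2.3"), ("example-b", "0.4.1")],
)
conn.executemany(
    "INSERT INTO deps VALUES (?, ?, ?, ?)",
    [
        ("example-a", None, "urllib3", ">=1.26"),
        ("example-b", "socks", "urllib3", "<2"),
        ("example-b", None, "idna", ""),
    ],
)

dependents = reverse_dependencies(conn, "urllib3")
for package_name, specifier, extra in dependents:
    print(package_name, specifier, f"[{extra}]" if extra else "")
```

A row with a non-NULL `extra` means the dependency only applies when that extra is requested.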

Download data

Download counts are taken from https://github.com/hugovk/top-pypi-packages and are only available for the top 5,000 packages.
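
Because most packages fall outside that list, queries on `downloads` should skip rows without a count. The sketch below is one way to do that; the helper name `top_downloaded` and the demo package names are hypothetical, and since the README does not say whether a missing count is stored as NULL or 0, the filter `downloads > 0` is chosen to exclude both.

```python
import sqlite3


def top_downloaded(conn, limit=5):
    """Return (name, downloads) for the most-downloaded packages with a known count."""
    query = (
        "SELECT name, downloads FROM packages "
        "WHERE downloads > 0 "  # excludes NULL as well as 0
        "ORDER BY downloads DESC, name LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()


# Stand-in table with only the columns used here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (name STRING, downloads INTEGER, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [
        ("example-a", 100),
        ("example-b", None),
        ("example-c", 0),
        ("example-d", 250),
    ],
)

ranked = top_downloaded(conn)
for name, downloads in ranked:
    print(f"{name}: {downloads}")
```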

Running locally

$ docker build -t pypi-data .
$ docker run --rm pypi-data

License

Apache-2.0