- All images from the subdirectories of `images` are copied into the `all_images/` folder with `copy_images.py`.
- Change the csv file name from `Data_Entry_2017_v2020.csv` to `data.csv`.
- Make a simpler `data2.csv` with `simplify_csv.py`: remove columns, rename columns, and ignore "No Finding" in "Finding Labels".
  - Original columns: `Image Index,Finding Labels,Follow-up #,Patient ID,Patient Age,Patient Gender,View Position,OriginalImage[Width,Height],OriginalImagePixelSpacing[x,y],`
  - After removing columns: `Image Index,Finding Labels,Patient Age,Patient Gender,View Position`
  - After renaming: `name,label,age,gender,position`
  - Image names are changed from png to jpg.
- Make `db.sqlite3` from `data2.csv` with `csv_to_sqlite.py`.
- The 1024x1024 png images in `all_images/` that exist in `data2.csv` (matched by name, ignoring the file extension) are resized to 512x512 jpg images with `resize_images.py` and written into `resized_images/`. ⚠︎ This takes a very long time, e.g. 783s.
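
The two CSV steps above can be sketched with the Python standard library alone. This is not the code in `simplify_csv.py` or `csv_to_sqlite.py`, only a minimal illustration under stated assumptions: "ignore" is read as skipping rows whose label is exactly "No Finding", the table name `images` is hypothetical, every column is stored as text, and the sample rows are made up for the example.

```python
import csv
import io
import sqlite3

# Original column -> simplified column, as listed in the steps above.
RENAME = {
    "Image Index": "name",
    "Finding Labels": "label",
    "Patient Age": "age",
    "Patient Gender": "gender",
    "View Position": "position",
}


def simplify_rows(rows):
    """Yield simplified rows: keep five columns, rename them, png -> jpg."""
    for row in rows:
        # Assumption: "ignore" means rows labelled exactly "No Finding" are skipped.
        if row["Finding Labels"] == "No Finding":
            continue
        out = {new: row[old] for old, new in RENAME.items()}
        if out["name"].lower().endswith(".png"):
            out["name"] = out["name"][:-4] + ".jpg"
        yield out


def load_into_sqlite(conn, rows, table="images"):
    """Create a table (hypothetical name) and insert the simplified rows."""
    cols = list(RENAME.values())
    column_defs = ", ".join(f"{c} TEXT" for c in cols)
    placeholders = ", ".join("?" for _ in cols)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_defs})")
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        ([r[c] for c in cols] for r in rows),
    )
    conn.commit()


# Made-up sample in the original column layout (not real dataset rows).
SAMPLE = """Image Index,Finding Labels,Follow-up #,Patient ID,Patient Age,Patient Gender,View Position
sample_000.png,Cardiomegaly,0,1,58,M,PA
sample_001.png,No Finding,0,2,41,F,AP
"""

simplified = list(simplify_rows(csv.DictReader(io.StringIO(SAMPLE))))
conn = sqlite3.connect(":memory:")  # use "db.sqlite3" to write a file instead
load_into_sqlite(conn, simplified)
stored = conn.execute(
    "SELECT name, label, age, gender, position FROM images"
).fetchall()
print(stored)
```

To run it against the real file, replace the `io.StringIO(SAMPLE)` reader with `open("data.csv", newline="")` and connect to `db.sqlite3` instead of `:memory:`.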