SVs in some chromosomes are missing in nr_deletion files #44

@ansontcw

Hello,

I downloaded the latest version of all files in the directory /pub/dbVar/sandbox/sv_datasets/nonredundant/deletions.

However, SVs on chromosomes 1, 2, 9, 10, 11, 12, and X appear to be missing from some files (e.g. GRCh37.nr_deletions.tsv, GRCh37.nr_deletions.bed, GRCh38.nr_deletions.tsv, and GRCh38.nr_deletions.bed).

For example, running sed '1,2d' GRCh38.nr_deletions.tsv | cut -f1 | sort -k1,1V | uniq -c gives:
183462 3
215779 4
179941 5
186375 6
177445 7
156126 8
100796 13
102341 14
88040 15
103951 16
97261 17
84456 18
89493 19
73053 20
46757 21
51888 22
7323 Y
56 mt

But GRCh38.nr_deletions.pathogenic.tsv (which I believe is a subset of GRCh38.nr_deletions.tsv) contains SVs from all chromosomes:

sed '1,2d' GRCh38.nr_deletions.pathogenic.tsv | cut -f1 | sort -k1,1V | uniq -c
1126 1
1657 2
788 3
499 4
677 5
729 6
863 7
561 8
725 9
452 10
691 11
370 12
409 13
321 14
765 15
1415 16
1175 17
342 18
489 19
272 20
189 21
661 22
1847 X
76 Y
16 mt
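
The gap can also be checked directly by listing the chromosome names that occur in the subset file but not in the full file. This is only a minimal sketch, assuming (as the commands above do) that both files have two header lines and the chromosome name in column 1; `missing_chroms` is a hypothetical helper name, not part of any dbVar tooling:

```shell
# Print each chromosome name found in the subset file ($1)
# that never appears in the full file ($2).
# Assumption: tab-separated input, two header lines, chromosome in column 1.
missing_chroms() {
    awk -F'\t' '
        FNR <= 2 { next }                           # skip the two header lines of each file
        FILENAME == ARGV[1] { seen[$1] = 1; next }  # first argument: the full file
        !($1 in seen) && !done[$1]++ { print $1 }   # subset-only names, printed once each
    ' "$2" "$1"
}
```

It would be called as missing_chroms GRCh38.nr_deletions.pathogenic.tsv GRCh38.nr_deletions.tsv. Names are printed in the order they first appear in the subset file, and no output means every chromosome in the subset is also present in the full file.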

It would be great if you could update the files. Thank you!

Best,
Anson
