This directory contains FASTA files holding a modified version of the
Genome Reference Consortium human genome build 37 (GRCh37/hg19, Feb. 2009).
The chromosomal sequences were assembled by the International Human 
Genome Project sequencing centers.  The hg19/GRCh37 assembly was changed 
to use IUPAC ambiguous nucleotide characters at each base covered by a 
stringently filtered subset of single-base substitutions annotated by 
dbSNP build 141.  For example, if the assembly has an 'A' at a position 
where dbSNP has annotated an A/C/T substitution SNP, the 'A' is replaced 
by 'H' in the FASTA file here.  
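
The substitution described above can be illustrated with a short Python
sketch. It maps a set of alleles to the standard IUPAC nucleotide code.
The function name iupac_for_observed and the "A/C/T" input format are
assumptions made for illustration; this is not the code UCSC used to build
the files.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases each represents.
IUPAC_CODES = {
    frozenset("A"): "A",
    frozenset("C"): "C",
    frozenset("G"): "G",
    frozenset("T"): "T",
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
    frozenset("CGT"): "B",
    frozenset("AGT"): "D",
    frozenset("ACT"): "H",
    frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_for_observed(observed):
    """Return the IUPAC code for a slash-separated allele string.

    The 'A/C/T' input format is an assumption for this sketch.
    """
    alleles = frozenset(observed.upper().split("/"))
    return IUPAC_CODES[alleles]


print(iupac_for_observed("A/C/T"))  # H
```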

dbSNP single-base substitutions were excluded from masking in the
following cases:
- UCSC tagged the dbSNP item with any of these exceptions (see also the
  exceptions field of the hg19.snp141 database table as well as the
  hg19.snp141ExceptionDesc table):
  - MultipleAlignments: dbSNP mapped the item to multiple locations.
  - ObservedMismatch: the reference allele does not appear in the item's
    observed alleles.
  - ObservedWrongFormat: the observed sequence has an unexpected format.
- dbSNP item class is not "single".
- dbSNP item length is not exactly one base.
- dbSNP item weight is greater than 1.  (lower weight = higher confidence)
The remaining single-base substitutions were used to mask the genomic 
sequence.
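
The exclusion rules above can be sketched as a single predicate in Python.
The dictionary keys (exceptions, class, chromStart, chromEnd, weight) are
modeled on column names of the hg19.snp141 table, and the comma-separated
form of the exceptions value is an assumption; check both against the
actual table schema before relying on this.

```python
# Exceptions that cause a dbSNP item to be left out of the masking.
EXCLUDED_EXCEPTIONS = {
    "MultipleAlignments",
    "ObservedMismatch",
    "ObservedWrongFormat",
}


def passes_filter(item):
    """Return True if a dbSNP item (a dict) would be used for masking.

    Key names and the comma-separated 'exceptions' string are assumptions.
    """
    exceptions = {e for e in item["exceptions"].split(",") if e}
    if exceptions & EXCLUDED_EXCEPTIONS:
        return False
    if item["class"] != "single":
        return False
    # Item length must be exactly one base.
    if item["chromEnd"] - item["chromStart"] != 1:
        return False
    # Lower weight means higher confidence; anything above 1 is dropped.
    if item["weight"] > 1:
        return False
    return True
```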

Files included in this directory:

chr*.subst.fa.gz - FASTA files with IUPAC characters for substitution SNPs

md5sum.txt - checksums of files in this directory
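
A downloaded file can be checked against md5sum.txt. The Python sketch
below assumes md5sum.txt uses the usual md5sum layout of one
"<checksum> <filename>" pair per line; the helper names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_lines(lines):
    """Map file name -> expected checksum (assumed md5sum output layout)."""
    expected = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            checksum, name = parts
            # md5sum marks binary-mode entries with a leading '*'.
            expected[name.lstrip("*")] = checksum.lower()
    return expected


# Example use, once the files have been downloaded:
#   with open("md5sum.txt") as handle:
#       expected = parse_md5sum_lines(handle)
#   ok = md5_of_file("chr1.subst.fa.gz") == expected["chr1.subst.fa.gz"]
```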

------------------------------------------------------------------
If you plan to download a large file or multiple files from this
directory, we recommend that you use ftp rather than downloading the
files via our website. To do so, ftp to hgdownload.cse.ucsc.edu
[username: anonymous, password: your email address], then cd to the
directory goldenPath/hg19/snp141Mask. To download multiple files, use
the "mget" command:

    mget <filename1> <filename2> ...
    - or -
    mget -a (to download all the files in the directory)

Alternative methods to ftp access:

Using an rsync command to download the entire directory:
    rsync -avzP rsync://hgdownload.cse.ucsc.edu/goldenPath/hg19/snp141Mask/ .
For a single file, e.g. chr1.subst.fa.gz:
    rsync -avzP \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/hg19/snp141Mask/chr1.subst.fa.gz .

Or with wget, all files:
    wget --timestamping \
        'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/snp141Mask/*'
With wget, a single file:
    wget --timestamping \
        'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/snp141Mask/chr1.subst.fa.gz' \
        -O chr1.subst.fa.gz

To uncompress the fa.gz files:
    gunzip <file>.fa.gz
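
The files can also be read without uncompressing them first. The Python
sketch below streams a gzipped FASTA file and tallies its characters, which
gives a quick count of how many bases carry an IUPAC ambiguity code. The
function names are hypothetical, and treating 'N' as an assembly gap rather
than a masked SNP is an assumption of this sketch.

```python
import gzip
from collections import Counter


def count_bases(fasta_gz_path):
    """Count sequence characters (upper-cased) in a gzipped FASTA file."""
    counts = Counter()
    with gzip.open(fasta_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                continue  # skip FASTA header lines
            counts.update(line.strip().upper())
    return counts


def ambiguous_total(counts):
    """Sum counts of characters other than A, C, G, T and N."""
    return sum(n for base, n in counts.items() if base not in "ACGTN")


# Example use, once a file has been downloaded:
#   counts = count_bases("chr21.subst.fa.gz")
#   print(ambiguous_total(counts))
```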

Name                 Last modified       Size

README.txt           29-Aug-2014 17:04   2.6K
chr1.subst.fa.gz     29-Aug-2014 16:24   75M
chr2.subst.fa.gz     29-Aug-2014 16:25   80M
chr3.subst.fa.gz     29-Aug-2014 16:25   66M
chr4.subst.fa.gz     29-Aug-2014 16:25   63M
chr5.subst.fa.gz     29-Aug-2014 16:25   60M
chr6.subst.fa.gz     29-Aug-2014 16:26   56M
chr7.subst.fa.gz     29-Aug-2014 16:26   52M
chr8.subst.fa.gz     29-Aug-2014 16:26   48M
chr9.subst.fa.gz     29-Aug-2014 16:26   40M
chr10.subst.fa.gz    29-Aug-2014 16:24   44M
chr11.subst.fa.gz    29-Aug-2014 16:24   44M
chr12.subst.fa.gz    29-Aug-2014 16:24   44M
chr13.subst.fa.gz    29-Aug-2014 16:24   32M
chr14.subst.fa.gz    29-Aug-2014 16:24   30M
chr15.subst.fa.gz    29-Aug-2014 16:24   27M
chr16.subst.fa.gz    29-Aug-2014 16:24   27M
chr17.subst.fa.gz    29-Aug-2014 16:25   26M
chr18.subst.fa.gz    29-Aug-2014 16:25   25M
chr19.subst.fa.gz    29-Aug-2014 16:25   18M
chr20.subst.fa.gz    29-Aug-2014 16:25   20M
chr21.subst.fa.gz    29-Aug-2014 16:25   12M
chr22.subst.fa.gz    29-Aug-2014 16:25   12M
chrM.subst.fa.gz     29-Aug-2014 16:26   6.4K
chrX.subst.fa.gz     29-Aug-2014 16:26   50M
chrY.subst.fa.gz     29-Aug-2014 16:26   8.1M
md5sum.txt           29-Aug-2014 17:02   1.3K
