Method 1

Log in


curl 'https://signon.jgi.doe.gov/signon/create' --data-urlencode 'login=*****' --data-urlencode 'password=*****' -c cookies > /dev/null
# Replace ***** with your username and password
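Because the login command sends its output to /dev/null, a failed login is easy to miss. The sketch below checks that the cookie file written by curl -c holds at least one cookie entry. The function name login_ok is hypothetical, and the check assumes the tab-separated cookie file format that curl writes. A cookie being present does not prove that the login succeeded, so treat this as a first sanity check only.

```shell
#!/usr/bin/env bash

# login_ok COOKIE_FILE (hypothetical helper): succeed if the cookie file
# written by `curl -c` contains at least one cookie entry.
login_ok() {
    local jar="$1"
    # An empty or missing file cannot contain a session cookie.
    [ -s "$jar" ] || return 1
    # Cookie entries have seven tab-separated fields; comment lines and
    # blank lines have fewer.
    awk -F'\t' 'NF >= 7 { found = 1 } END { exit !found }' "$jar"
}

# Usage, after running the login command above:
#   login_ok cookies || echo "no cookie stored, check username and password" >&2
```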

Download the list of all files

curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get-directory?organism=PhytozomeV12' -b cookies > files.xml

https://genome.jgi.doe.gov

Download files

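files.xml records the size, storage path, md5, type, and other details of each file.
For example, the entry below describes the Arabidopsis thaliana CDS sequence file. Extract the content of url="...", replace "&amp;" with "&", prepend the site address https://genome.jgi.doe.gov, and download the result with curl (remember to specify the cookie file).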

<file label="PhytozomeV12" filename="Athaliana_167_TAIR10.cds_primaryTranscriptOnly.fa.gz" size="10 MB" sizeInBytes="11041833" timestamp="Wed Jan 08 16:38:08 PST 2014" url="/portal/ext-api/downloads/get_tape_file?blocking=true&amp;url=/PhytozomeV12/download/_JAMO/585474407ded5e78cff8c47a/Athaliana_167_TAIR10.cds_primaryTranscriptOnly.fa.gz" project="" library="" md5="6085fd39ad3327c727838f9da4f4b222" fileType="Assembly" />
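The rewrite of the url attribute can be sketched as a small shell function. The names JGI_BASE and jgi_full_url are hypothetical and chosen for illustration; only the site address and the "&amp;" to "&" replacement come from the text.

```shell
#!/usr/bin/env bash

# Site address to prepend, as given in the text.
JGI_BASE='https://genome.jgi.doe.gov'

# jgi_full_url URL_ATTRIBUTE (hypothetical helper): turn the value of a
# url="..." attribute from files.xml into a full download URL.
jgi_full_url() {
    local path
    # Undo the XML escaping of "&". In a sed replacement a bare "&" means
    # "the matched text", so the literal ampersand is written as "\&".
    path=$(printf '%s\n' "$1" | sed 's/&amp;/\&/g')
    printf '%s%s\n' "$JGI_BASE" "$path"
}

# Usage:
#   curl "$(jgi_full_url '/portal/ext-api/downloads/get_tape_file?blocking=true&amp;url=...')" -b cookies -o out.fa.gz
# where "..." stands for the rest of the path copied from files.xml.
```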

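The commands below are test downloads of Arabidopsis thaliana data files. Doing this by hand is tedious for bulk downloads, but you can look up the entries in files.xml
and put the corresponding curl commands into a bash script to download in bulk.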

curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get_tape_file?blocking=true&url=/PhytozomeV12/download/_JAMO/585474407ded5e78cff8c47a/Athaliana_167_TAIR10.cds_primaryTranscriptOnly.fa.gz' -b cookies > Athaliana_167_TAIR10.cds_primaryTranscriptOnly.fa.gz

curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get_tape_file?blocking=true&url=/PhytozomeV12/download/_JAMO/587b0adf7ded5e4229d885ab/Athaliana_447_TAIR10.fa.gz' -b cookies > Athaliana_447_TAIR10.fa.gz

curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get_tape_file?blocking=true&url=/PhytozomeV12/download/_JAMO/587b0ade7ded5e4229d885aa/Athaliana_447_Araport11.protein_primaryTranscriptOnly.fa.gz' -b cookies > Athaliana_447_Araport11.protein_primaryTranscriptOnly.fa.gz

curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get_tape_file?blocking=true&url=/PhytozomeV12/download/_JAMO/587b0ade7ded5e4229d885a8/Athaliana_447_Araport11.gene.gff3.gz' -b cookies > Athaliana_447_Araport11.gene.gff3.gz

curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get_tape_file?blocking=true&url=/PhytozomeV12/download/_JAMO/587b0adb7ded5e4229d885a1/Athaliana_447_Araport11.cds_primaryTranscriptOnly.fa.gz' -b cookies > Athaliana_447_Araport11.cds_primaryTranscriptOnly.fa.gz
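One way to build such a bash script is to generate the curl calls from files.xml instead of copying them by hand. This is a minimal sketch, not a tested downloader. It assumes that every <file> element carries filename, url, and md5 attributes in straight double quotes, that the cookie file is named cookies as in the login step, and that md5sum (GNU coreutils) is installed. The function names xml_attr, list_downloads, and download_all are hypothetical.

```shell
#!/usr/bin/env bash

JGI_BASE='https://genome.jgi.doe.gov'

# xml_attr NAME: read one <file ...> element on stdin and print the value
# of attribute NAME.
xml_attr() {
    sed -n "s/.* $1=\"\([^\"]*\)\".*/\1/p"
}

# list_downloads XML PATTERN: print "full-url<TAB>filename<TAB>md5" for every
# <file> entry whose filename matches the extended regex PATTERN.
list_downloads() {
    local xml="$1" pattern="$2" line name url md5
    grep -o '<file [^>]*>' "$xml" | while IFS= read -r line; do
        name=$(printf '%s\n' "$line" | xml_attr filename)
        [[ "$name" =~ $pattern ]] || continue
        # Unescape "&amp;" so the query string is valid for curl.
        url=$(printf '%s\n' "$line" | xml_attr url | sed 's/&amp;/\&/g')
        md5=$(printf '%s\n' "$line" | xml_attr md5)
        printf '%s%s\t%s\t%s\n' "$JGI_BASE" "$url" "$name" "$md5"
    done
}

# download_all XML PATTERN: fetch each matching file with the login cookie
# and compare its md5 with the value recorded in files.xml.
download_all() {
    local url name md5
    list_downloads "$1" "$2" | while IFS=$'\t' read -r url name md5; do
        curl "$url" -b cookies -o "$name"
        if [ -n "$md5" ] && [ "$(md5sum "$name" | cut -d' ' -f1)" != "$md5" ]; then
            echo "md5 mismatch: $name" >&2
        fi
    done
}

# Usage: preview first, then download.
#   list_downloads files.xml 'Athaliana_447.*\.fa\.gz$'
#   download_all   files.xml 'Athaliana_447.*\.fa\.gz$'
```

Running list_downloads before download_all makes it possible to check the selection without transferring anything.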

Method 2 | Get JGI Genomes

This method is suited to bulk downloads.

Download

git clone https://hub.fastgit.org/guyleonard/get_jgi_genomes.git

Usage

Usage:
  get_jgi_genomes [-u <username> -p <password>] | [-c <cookies>] [-f | -a | -P 12 | -m 3] (-i) (-l) (-A) (-C) (-g) (-t) (-q)

Required:
	-u <username>
	-p <password>
or
	-c <cookie file>
Portal Choice:
	-f Mycocosm aka fungi
	-a Phycocosm aka algae
	-P <version> PhytozomeV aka plants
	-m <version> MetazomeV aka metazoans
Portal File Options:
	-A get assembly
	-C get CDS
	-g get GFF
	-t get transcripts
JGI Taxa ID:
	-i <id> JGI ID of Genome Project
Other:
	-l list only, no downloads

Download examples

# Log in:
./bin/get_jgi_genomes -u your.email@address.com -p password


# After logging in, download the list of all protein files from Mycocosm:
./bin/get_jgi_genomes -c signon.cookie -f -l

# After logging in, download all CDS files from Phycocosm:
./bin/get_jgi_genomes -c signon.cookie -a -C

# After logging in, download all assembly files from Phytozome V12:
./bin/get_jgi_genomes -c signon.cookie -P 12 -A

Method 3 | jgi-query

This is a script written in Python. If you are interested, see the usage information below.

Download

git clone https://github.com/glarue/jgi-query.git

Usage

usage: jgi-query.py [-h] [-x [XML]] [-c] [-s] [-f] [-u] [-n RETRY_N]
                    [-l logfile] [-r REGEX] [-a]
                    [organism_abbreviation]

This script will list and retrieve files from JGI using the curl API. It will
return a list of all files available for download for a given query organism.

positional arguments:
  organism_abbreviation
                        organism name formatted per JGI's abbreviation. For
                        example, 'Nematostella vectensis' is abbreviated by
                        JGI as 'Nemve1'. The appropriate abbreviation may be
                        found by searching for the organism on JGI; the name
                        used in the URL of the 'Info' page for that organism
                        is the correct abbreviation. The full URL may also be
                        used for this argument (default: None)

optional arguments:
  -h, --help            show this help message and exit
  -x [XML], --xml [XML]
                        specify a local xml file for the query instead of
                        retrieving a new copy from JGI (default: None)
  -c, --configure       initiate configuration dialog to overwrite existing
                        user/password configuration (default: False)
  -s, --syntax_help
  -f, --filter_files    filter organism results by config categories instead
                        of reporting all files listed by JGI for the query
                        (work in progress) (default: False)
  -u, --usage           print verbose usage information and exit (default:
                        False)
  -n RETRY_N, --retry_n RETRY_N
                        number of times to retry downloading files with errors
                        (0 to skip such files) (default: 4)
  -l logfile, --load_failed logfile
                        retry downloading from URLs listed in log file
                        (default: None)
  -r REGEX, --regex REGEX
                        Regex pattern to use to auto-select and download files
                        (no interactive prompt) (default: None)
  -a, --all             Auto-select and download all files for query (no
                        interactive prompt) (default: False)