Wednesday, February 24, 2010

Using Aspera instead of FTP to download from NCBI

If you often download large amounts of data from NCBI via their FTP site, you might be interested to know that NCBI has recently started using the commercial software Aspera to improve download speeds. This was announced in their August newsletter and at first applied only to the Short Read Archive (SRA). However, I recently found out that they are now making all of their data available this way.

How to use it (web browser)
  1. Download and install the Aspera browser plugin software.
  2. Browse the Aspera NCBI archives.
  3. Click on the file or folder you want to download and choose a place to save it.
  4. The Aspera download manager should open (but see the critique below) and show the download progress.
How to use it (command line)
  1. The browser plugin also includes the command line program ascp (on Linux this is at ~/.aspera/connect/bin).
  2. There are many options, but the standard usage is:
ascp -QT -i ../etc/asperaweb_id_dsa.putty anonftp@ftp-private.ncbi.nlm.nih.gov:/source_directory /destination_directory/

e.g.:
ascp -QT -i ../etc/asperaweb_id_dsa.putty anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz ~/
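For scripting, it can help to put the plugin's bundled paths into variables first. A minimal sketch using the default Linux install location mentioned above (adjust the paths if your plugin lives elsewhere):

ASCP=~/.aspera/connect/bin/ascp
KEY=~/.aspera/connect/etc/asperaweb_id_dsa.putty   # key file bundled with the Connect plugin
"$ASCP" -QT -i "$KEY" anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz ~/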

Critique
  • A Windows machine with Firefox worked with no problems, and download speeds at my institution were much faster than with FTP (~0.5-4.0 Mbps vs. 50-300 kbps).
  • The browser plugin with Firefox on Linux would not work! The plugin seemed to load properly, but the Aspera download manager would not start. Update: this was due to me installing the plugin as root, which caused a permission error. The plugin is installed in your home directory and must not be installed as root.
  • Downloading with the command line in Linux was unreliable. This was a huge disappointment, as it was the primary method I was hoping to use. Files would start to download correctly with very fast transfer speeds (1-4 Mbps), but the connection would drop with the error: "Session Stop (Error: Connection lost in midst of data session)". Unfortunately, there is no way to resume the download, so each time I had to start over. On about the 8th try it downloaded the file (6889 MB) correctly. Update: see below.
Personal Opinion
Although I was excited to see NCBI trying to improve data transfer speeds, I was not very impressed with the Aspera solution. Hopefully, it will become more reliable in the future.
Of course, my personal solution would be for NCBI to embrace BitTorrent technology and make use of BioTorrents, but I will save that discussion for another day.


Update:
All ascp options are shown below (obtained by typing ascp without arguments). However, I can't find any further documentation on these options. As noted in the comments below, -k2 is supposed to resume a download, but it didn't work for me when I tested it.
usage: ascp [-{ATdpqv}] [-{Q|QQ}] ...
[-l rate-limit[K|M|G|P(%)]] [-m minlimit[K|M|G|P(%)]]
[-M mgmt-port] [-u user-string] [-i private-key-file.ppk]
[-w{f|r} [-K probe-rate]] [-k {0|1|2|3}] [-Z datagram-size]
[-X rexmsg-size] [-g read-block-size[K|M]] [-G write-block-size[K|M]]
[-L log-dir] [-R remote-log-dir] [-S remote-cmd] [-e pre-post-cmd]
[-O udp-port] [-P ssh-port] [-C node-id:num-nodes]
[-o Option1=value1[,Option2=value2...] ]
[-E exclude-pattern1 -E exclude-pattern2...]
[-U priority] [-f config-file.conf] [-W token string]
[[user@]host1:]file1 ... [[user@]host2:]file2

-A: report version; -Q: adapt rate; -T: no encryption
-d: make destination directory; -p: preserve file timestamp
-q: no progress meter; -v: verbose; -L-: log to stderr
-o: SkipSpecialFiles=yes,RemoveAfterTransfer=yes,RemoveEmptyDirectories=yes,
PreCalculateJobSize={yes|no},Overwrite={always|never|diff|older},
FileManifest={none|text},FileManifestPath=filepath,
FileCrypt={encrypt|decrypt},RetryTimeout=secs

HTTP Fallback only options:
[-y 0/1] 1 = Allow HTTP fallback (default = 0)
[-j 0/1] 1 = Encode all HTTP transfers as JPEG files
[-Y filename] HTTPS key file name
[-I filename] HTTPS certificate file name
[-t port number] HTTP fallback server port #
[-x ]]

Update 2:
After spending an afternoon with Aspera Support, I have some answers to my connection and resume issues when using ascp. The problem was that I was not using the -l option to limit the speed at which ascp sends data. I thought this limit would only be relevant if 1) I did not want to use all of my available bandwidth or 2) my computer hardware could not handle the bandwidth of the file transfer. Surprisingly, the reason for my disconnects was that NCBI was trying to send more data than my bandwidth allowed, which caused my connection to drop. I would have thought that ascp would look after these types of bandwidth differences, considering that every other data transfer protocol I know of can control its rate of data flow. If this is the case, it suggests that my connection may still be broken if my available bandwidth drops for some reason (which happens often due to network fluctuations at a large institution), even if I set the limit appropriately. Hopefully, Aspera can make their data transfer method a little more robust in the future. I don't think I will be replacing FTP with ascp in my download scripts quite yet.
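For reference, a fixed-rate command along the lines support had me testing would look something like this (the 30m cap is my guess at a value just below the ~35 Mbps I was seeing, not their exact instruction):

ascp -T -l 30m -k2 -i ../etc/asperaweb_id_dsa.putty anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz ~/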

Update 3:
Michelle from Aspera finally let me know that -Q is the option I should be using to enable adaptive rate control. Now I am trying to download an entire directory, but I am still having connection issues. Here is a screenshot of my terminal showing that the directory resume is not working and I am losing my connection:


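In the meantime, a crude workaround for the dropped sessions is to wrap ascp in a retry loop so that an interrupted directory transfer is restarted and (via -k2) resumed. A minimal sketch, assuming ascp exits with a non-zero status when the session is lost (I have not verified its exit codes) and using an arbitrary 100m target rate:

# retry until ascp exits successfully; -k2 should let each attempt resume rather than restart
until ~/.aspera/connect/bin/ascp -QT -k2 -l 100m \
      -i ~/.aspera/connect/etc/asperaweb_id_dsa.putty \
      anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria ~/ ; do
    echo "ascp session dropped; retrying in 60 seconds" >&2
    sleep 60
done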

Comments:

Cliff Beall said...

Thanks for this. It worked well with the Aspera plug-in in Safari on OS X, about 20X faster than FTP. I may try some experimentation with the command line version - I have some possible future applications where scripting would probably be useful.

Morgan Langille said...

@Cliff, Yes the speed difference is quite amazing. Let me know if you have better results with the command line program.

Cliff Beall said...

I just tried a 10 MB and a 2 GB download from the command line and they were both ok. I did use a slightly different command, with a bandwidth limit: -l 200M.

I found some NCBI documentation in the 1000genomes folder on the FTP site - there is an Aspera_Users.README file and an aspera_transfer_guide.pdf there.

By the way, for other Mac users, Aspera put my ascp file in:
/Applications/Aspera Connect.app/Contents/Resources/

It was actually inside the application bundle, and you need to escape the space with a backslash to get the path to that. The .putty file was in the same directory.
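So on a Mac, the equivalent of the example in the post would be something along these lines (a sketch based on the paths Cliff reports; the bundle layout may differ between versions):

ASCP="/Applications/Aspera Connect.app/Contents/Resources/ascp"
KEY="/Applications/Aspera Connect.app/Contents/Resources/asperaweb_id_dsa.putty"
"$ASCP" -QT -i "$KEY" anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz ~/

Quoting the variables handles the space in the bundle name, as an alternative to backslash-escaping it.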

John said...

If you type ascp without any arguments you will see a small help output. To resume transfers use -k2. Also, the -l argument probably needs to be set to something your network and storage are capable of. A 7200rpm disk will probably sustain 150-200 Mbps, so you could try -l200m.

Morgan Langille said...

Yes, I was aware that typing ascp without any arguments would show me all the options. However, there is no information in the output to suggest that -k2 can be used to resume a download (any chance John works for Aspera?). In fact, I just tested that option and it didn't work for me at all.
I have edited my initial blog post to show the options for ascp.

Regarding the bandwidth limit, I had also previously tried that, but I still got the same connection error. To be clear, my speeds were not even getting close to 200 Mbps (the max was 35 Mbps), so setting that option seems like it would not have any effect. Also, who do you expect to have bandwidth this high to NCBI?

John said...

I do work for Aspera. To get a complete response to what your problem is (there are a lot of potential issues) you should contact support@asperasoft.com, open a ticket and let them know you are doing transfers with NCBI. They will ask for your log files, which contain metrics we can look at to find out what the problems are. If you want to isolate log files (normally written to syslog) to send to support, the -L switch can be used to specify a log directory. For example:

$ ascp -QT -i <private-key-file> -l 35m -k2 -L /tmp/aspera user@host:ncbifiles /data

The /tmp/aspera directory will need to exist, but after your transfer you will see some log files.

As for the -k2, that option specifies how to deal with partial files. The support team can also look at the logs to see why this did not work.

I have some ideas what is going on, but the support team is very helpful and the log files will help us identify the exact cause.
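A concrete version of John's example using the anonftp key from the post (the log directory just needs to exist before the transfer starts):

mkdir -p /tmp/aspera
ascp -QT -l 35m -k2 -i ../etc/asperaweb_id_dsa.putty -L /tmp/aspera \
    anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz ~/
ls /tmp/aspera/    # the transfer log(s) written here can be sent to support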

Morgan Langille said...

I have made an additional update to my original blog post explaining the reasons that Aspera gave me for my connection problems.

If others have experiences with the software it would be great to hear your opinions.

Michelle Munson said...

Morgan,

Your usage missed a fundamental capability in our software. The 'ascp' command line binary MUST have the -Q option in place to use the adaptive rate control. Without it, the software uses a fixed rate mode, without regard for the available bandwidth, and hence there is the potential for disconnecting your own transfers by overdriving the connection. This is the sole reason for the disconnects you experienced and the fluctuating speed.

If you use the -Q option, the target transfer rate (-l) does not need to be adjusted at all. This was your primary problem.

I suppose we should make this the default in the 'ascp' command line (it is the default in the standalone products, but not in the command line).

Morgan, additionally, we are also releasing our 2.6 version of the software with disk-based rate control enabled. This is important for very high speed transfers where the network bottleneck speed exceeds the disk throughput - it basically extends the congestion avoidance of the adaptive rate control through to the disk. At the speeds you tested, this is not important, but for other NCBI users it is.

I am taking time to reply on this because it is extremely important that users understand how to properly use the software, to realize its intended performance.

Thanks,

Michelle Munson
President, Aspera, Inc.

Michelle

Michelle Munson said...

Morgan,

I would like to have you run the following command to demonstrate our points:

ascp -QT -l 200M -k2 -i ../etc/asperaweb_id_dsa.putty anonftp@ftp-private.ncbi.nlm.nih.gov:/source_directory /destination_directory/

This will do the following:

- Use Adaptive rate control and automatically adjust the transmission rate to the available bandwidth, which is around 35 Mbps REGARDLESS of the target rate (-l 200M, 200 Mbps). You were using Fixed rate control (no -Q) before, which fixed the transmission rate at 200 Mbps.

Adaptive rate control is the automatic adjustment to available bandwidth that you mentioned 'ascp' should do.

- If you interrupt the transfer and restart it, will resume the transfer from the point of interruption (-k2 flag).

The default usage is to not resume, to overwrite files at the destination. This is documented in the command line usage, 'man ascp'.

Thank you,
Michelle

Morgan Langille said...

Hi Michelle,

Thanks for your comments. You are the third person from your company to contact me about how to fix my problem. I am surprised that your support person didn't mention that I should use the -Q option?!

You suggest looking at the man page, but there is no man page documentation included with the "Aspera Connect" package. There may be a man page with the "Aspera Client", but that is not available for download by the general public (a username & password is requested).

Also, the -k2 option only works if the -l option is used.

Considering the large number of users that will eventually be using the ascp command line program, it would be great if many of these options were defaults or better explained.

Morgan Langille said...

As a further comment, there seem to be some limitations with downloading directories from the command line. The files start to download, but then I get a permission denied error on hitting the file ".a.swp". That is fine since I don't want that file, but ascp errors out and does not continue on to the next file.

Also, it seems that I can't use a "*" to only get some of the files in a directory, which I can do with FTP.
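The usage output above does list an -E exclude-pattern option; something along these lines might skip editor swap files such as .a.swp, although I have not tested whether the pattern syntax is a shell-style glob or whether excluding the file also avoids the stop-on-error behaviour:

ascp -QT -k2 -E "*.swp" -i ../etc/asperaweb_id_dsa.putty \
    anonftp@ftp-private.ncbi.nlm.nih.gov:/source_directory /destination_directory/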

Michelle Munson said...

Morgan,

Ric Mackie in our support team did in fact tell you to use the -Q flag when downloading to engage adaptive rate control.

Regarding the man page inclusion with 'ascp', we do not include the man page with the Connect client installer because it is intended to be used as a browser plug-in application. You are using the contained 'ascp' binary (which is fine) but that is typically done when a user has installed our desktop client package, Aspera client. This package (rpm or deb) includes the man page.

Regarding the permission denied errors when transferring these files, ascp uses the native file system permissions to determine if it can read or write files. Are you certain that the user account context in which the 'ascp' process is running has access to read and/or write the special files in question (on the source or destination, whichever applies)?

Regarding support of glob matches (*), great point. We are actually adding support for that in our 2.6 release series. I completely agree ascp should support it.

Regarding -k2 depending on the use of the -l option, I haven't noticed that myself, but if -l is not specified at the command line, ascp uses a default target rate of 10 Mbps (which limits the transfer rate to 10 Mbps) on top of the automatically adapted rate -- probably not what you want. You will want to include a "-l" in your command line options that is as high as you would ever want the transfer to be, e.g. -l 200M in your case.

I realize there are a number of command line options, but it is worth learning them to get the results you want. Users of our GUI client products don't have to worry about any of these -- they are built in to the application.

Thanks for all of your feedback.

Michelle

Michelle Munson said...

Morgan,

'ascp' does in fact resume a directory download even if no -l is specified, for example try the following:

ascp -TQ -k2 asperaweb@demo.asperasoft.com:aspera-test-dir-large .

Terminate the transfer part-way through.

Then repeat the command. The transfer will pick up from where it left off. Any files previously transferred are "skipped" and the transfer resumes from within the file where it was interrupted.

Michelle

Morgan Langille said...

"Ric Mackie in our support team did in fact tell you to use the -Q flag when downloading to engage adaptive rate control."

No, he didn't. He told me to make sure I set the -l limit to my maximum bandwidth or just below, so that the packets don't get backed up due to me not having enough bandwidth. There was no mention of -Q.

"Regarding the man page inclusion with 'ascp', we do not include the man page with the Connect client installer because it is intended to be used as a browser plug-in application. You are using the contained 'ascp' binary (which is fine) but that is typically done when a user has installed our desktop client package, Aspera client. This package (rpm or deb) includes the man page."

So why not allow everyone to download this client? Would have saved me quite a bit of time.

"Regarding the permission denied errors when transferring these files, ascp uses the native file system permissions to determine if it can read or write files. Are you certain that the user account context in which the 'ascp' process is running has access to read and/or write the special files in question (on the source or destination, whichever applies)?"


Yes, I probably don't have permission to read that particular file. That is fine. The problem is that ascp seems to have problems continuing on afterwards.


"Regarding -k2 depending on the use of the -l option, I haven't noticed that myself, but if -l is not specified at the command line, ascp uses a default target rate of 10 Mbps (which limits the transfer rate to 10 Mbps) on top of the automatically adapted rate -- probably not what you want. You will want to include a '-l' in your command line options that is as high as you would ever want the transfer to be, e.g. -l 200M in your case."


Actually, it was suggested that my setting of -l 200M was the reason -k2 wasn't working.

"I realize there are a number of command line options, but it is worth learning them to get the results you want. Users of our GUI client products don't have to worry about any of these -- they are built in to the application."


The whole point of me making this blog post was to provide information about how to use Aspera to get data from NCBI. Considering there is no documentation anywhere about the command line program, I thought the default settings would at least allow me to download a few files without problems.


"'ascp' does in fact resume a directory download even if no -l is specified, for example try the following:

ascp -TQ -k2 asperaweb@demo.asperasoft.com:aspera-test-dir-large .

Terminate the transfer part-way through.

Then repeat the command. The transfer will pick up from where it left off. Any files previously transferred are 'skipped' and the transfer resumes from within the file where it was interrupted."


This did not work for me. I have attached a picture to my blog to show my problem.

Michelle Munson said...

Morgan,

In reply, not necessarily in order ....

Ric Mackie feels very concerned because he did tell you to use the -Q option, but the important point is simply to note the option for your usage going forward.

Regarding the main Aspera Client, it is available for use but only with a purchased license, whereas the Aspera Connect included on NCBI's site is freely distributed as part of the NCBI license. Some NCBI partners use the main Aspera Client in lieu of Connect.

The Connect web client is intended to be used as a browser plug-in, not from the command line, and its documentation is geared accordingly: http://download.asperasoft.com/download/docs/connect/2.3/aspera-connect-linux.html

Regarding documentation of the 'ascp' command line usage, you will find complete documentation for download on our web site for all products for which the 'ascp' command line is an intended usage.

See, for example, Aspera Client ascp command line usage:

http://download.asperasoft.com/download/docs/scp_client/2.5/aspera-client-unix.html#ascp-usage

NCBI has a login and password to access this. Here is a temporary login and password you can use:

Login: tmpmarsw
Password: a8mbnu88

Last year we also provided this to some of the folks at NCBI to publish as part of the FAQ. Please feel free to add this info for your users.

Regarding your problem with the download not resuming when specifying -k2, I cannot see the error picture. It is possible that if you used fixed rate (no -Q) and a 200 Mbps target rate on a 35 Mbps connection, the severe overdrive would cause the transfer to fail rather than complete the resume checks.

That said, by adding -Q to engage adaptive rate control, the transfer session will not overdrive and there will be no problem in resuming transfers.

Regarding the permission denied error causing ascp to not continue transferring, that is actually not the behavior at all. Please drop me your log file (/var/log/messages) and I will take a look to determine the root cause.

Thanks,
Michelle

Michelle Munson said...

Morgan,

I was able to zoom in on the picture you provided and can see the error. This sort of "Connection lost" error shortly into the transfer precisely means that one end of the transfer was not able to receive UDP traffic on the FASP port from the other end for 10 seconds (if the session has not been fully established) or for more than 60 seconds if the session has been fully established.

This error is terminating the connection and preventing the resume check and the progress of the transfer.

There is more than one possible root cause, and it can be determined from the transfer log files.

Would you please zip and upload the /var/log/messages file to demo.asperasoft.com. You can use Unix scp with user 'support' and password 'demoaspera'.

Please email me once available and I will review and reply asap.

Thanks,
Michelle

Morgan Langille said...

Michelle,

I think many of these problems could have been avoided if NCBI or Aspera posted documentation for the ascp program. If the documentation is not freely available maybe there should be a warning that it is not for command line use?

I think many users would like to use the Aspera software to speed up their downloads, but for some reason NCBI is not providing much information about it or that it even exists. Maybe NCBI is still in a "test" phase of using your software?

Anyway, thanks for your help so far, I think we are making some forward progress.

I have uploaded my /var/log/messages file to your server, in hopes of you determining my connection issues.

Michelle Munson said...

Morgan,

I am glad we were able to get to the bottom of the three issues you were facing. Hopefully they can be of help to other users:

1. A rate limiting cap in your network on UDP traffic that strictly limits the transmission rate for Aspera FASP.

2. The use of the skip special files option at the command line made it appear as if the transfer was not making progress when in fact it was actually skipping multiple special files in a row.

3. The permission denied error caused by the source file permissions being propagated to the destination files, unless you request an alternate directory mask through the ascp configuration file.

Please let us know if you have any other questions. Hopefully you will be able to share your ongoing experiences with your readers.

Michelle
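For reference, the "skip special files" option mentioned in point 2 appears to correspond to -o SkipSpecialFiles=yes in the usage listing above; a sketch of how it might be combined with the flags discussed so far (untested):

ascp -QT -k2 -o SkipSpecialFiles=yes -i ../etc/asperaweb_id_dsa.putty \
    anonftp@ftp-private.ncbi.nlm.nih.gov:/source_directory /destination_directory/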

Alex Lash said...

About a week ago, I started a 12 TB (many files) download from NCBI using ascp with the following options:

-QTr -l 300M -k 1

For a couple of days it was downloading at about 2 GB/min. After about 5.5 TB, however, I got timeout errors, and the process stopped.

A day later, after reading this blog, I restarted the download using the -k 2 option.

ascp immediately began deleting incomplete files so that I went from 5.5 TB down to 2.4 TB.

It doesn't seem to be adding any new files now, and I'm seeing many of the following errors in the messages log:

ERR rex_add: rex buffer full

Any help anyone could provide would be greatly appreciated.

Morgan Langille said...

Hi Alex,

12TB, that is a ton of data. I have been trying to just get all of GenBank downloaded (300GB) and have been failing miserably. I have tried FTP and also Aspera (web plugin and ascp). Aspera is faster, but I still lose my connection at some point. The problem with resuming either FTP or Aspera is that the file checking takes so long that often by the time it starts to download again I get another disconnect.

As to your specific error with Aspera, I have no idea since I don't work for them.

If you happen to have a recent download of GenBank (still compressed and untouched) please add it to BioTorrents. These disconnect issues would not be a problem if NCBI used BitTorrent to distribute their bigger datasets.

Zach Stednick said...

Thanks for this post, I have been having similar issues with getting data out of dbGaP. Unfortunately, your post did not address exactly what I needed, but it was informative and I did learn quite a bit about Aspera, which hopefully will help me in the future. More than anything, it was good to see that I am not the only person dealing with similar issues.

Anonymous said...

It is horrible that NCBI uses Aspera. My tax money is totally wasted.

Terrible interface, complicated command switches, no 'About' in the Firefox plug-in interface. It stays in the 'connecting' status forever, with no explanation for any error.

Learn something from wget please!!!

Unknown said...

I just gave ascp a spin and it worked on first try. At least 20 times faster than wget, wow!

On Linux, installation was trivial and I'm using the command line tool as follows:

/home/www/.aspera/connect/bin/ascp -QT -l200M -i /home/www/.aspera/connect/etc/asperaweb_id_dsa.putty anonftp@ftp-private.ncbi.nlm.nih.gov:/blast/db/nr.*.tar.gz ./

Haven't tried the plug-in and probably never will.

Bart Hazes said...

Worked great on first try. At least 20 times faster than wget. I'm using the command line tool as follows:

/home/www/.aspera/connect/bin/ascp -QT -l200M -i /home/www/.aspera/connect/etc/asperaweb_id_dsa.putty anonftp@ftp-private.ncbi.nlm.nih.gov:/blast/db/nr.*.tar.gz ./

This really makes a difference!

Randy Short said...

In what ways is Aspera the same as BitTorrent? What are the architectural similarities?

Anonymous said...

Hi Michelle

Great post. I was getting 'Session Stop (Error: Session data transfer timeout (server), Peer Error: Session'

Applied your suggestion with the -TQ and -k2 flags

ascp -i "//home/.ssh/id_dsa" -TQ -k2 // ://

and it worked a treat no errors :-)

Thanks

Zahid Ali
(Pearson UK)

Harry Mangalam said...

It is now quite a bit after this original thread, but dealing with Aspera is still difficult.

I've been trying for about 3 weeks to get about 30TB of data from the Broad Institute to UC Irvine via Aspera using the Linux command-line ascp client. I tried convincing both the Broad and my client that they'd be better off buying ten 3TB disks and re-using them for such transfers, but no go.

The receive node is a 64-core node with a single 1Gb connection to the public Internet via CENIC, through a pipe that is relatively quiet. It should be able to carry at least 80MB/s to our 10Gb backbone, which runs to CENIC. It is running at about 60MB/s, most of it via ascp, through 5 parallel ascp connections.

Inspired by reading this blog (Thanks!) I tried a serial connection:

/root/.aspera/connect/bin/ascp -O 33005 -QT -l 500M -k2 user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad

(This was canceled after I realized that to pick up the transfer again, it would have to read through the 12 TB that I've already transferred. It appears to be doing something like an rsync, which may be the only way to get it to work, but that's a big chunk of disk activity.)

The 4 parallel transfers are:

/root/.aspera/connect/bin/ascp -C 1:5 -O 33001 -l 1000m user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad
/root/.aspera/connect/bin/ascp -C 2:5 -O 33002 -l 1000m user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad
/root/.aspera/connect/bin/ascp -C 3:5 -O 33003 -l 1000m user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad
/root/.aspera/connect/bin/ascp -C 4:5 -O 33004 -l 1000m user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad

(and this one was started after the serial one was canceled - see above)
/root/.aspera/connect/bin/ascp -C 4:5 -O 33004 -l 1000m user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad

Note that the option '-C 4:5 -O 33004' is identical between the last 2 commands. For some reason the obvious increment '-C 5:5 -O 33005' would not work - the ssh connection was refused. They all appear to be working, and all commands claim to be working on the same file 'MH0131639.bam', although it's impossible to see what's happening to that file since ascp apparently sets the eventual size of the file up front and (at least on a gluster filesystem) it is always that size; it does not increment as data comes in:

1101 $ ls -lh MH0131639.bam
-rw-r--r-- 1 root root 147G Apr 30 16:55 MH0131639.bam

The size of the file shown in the separate windows does not equal the size of the segments, if that's what it is supposed to indicate; those sizes at this point are:

14GB + 44GB + 75GB + 101GB + 91GB (= 325GB, much larger than the total size of the file).

So, parts of it work. It has the above-mentioned parallel option that I've been trying for about a week, and the main problem is that (as others have noted) it keeps dropping the connection. Not only does it drop the connection, but the ascp process keeps running, so a monitor script cannot tell that the process has dropped (although I recently discovered that if you add a few [Enter]s after the one for the password, it will drop out of the process, so a monitor script can now tell that one process has stopped).

The transfer speed seems to be about 10 MBytes/s per connection. Not bad, but not much better than my experience with bbcp, which is free, quite reliable, and whose author is quite responsive to suggestions. The maximum over all ascp connections is about 50-60 MBytes/s.

If it keeps working (and it often doesn't), it will transfer data, but I don't see much improvement over the transfer rates of bbcp or, better, gridFTP (and especially its consumer interface Globus Online, which can move data extraordinarily fast and with less fuss).

Why the Broad Institute and especially the NIH would use a commercial 'solution' that seems to still be in an alpha stage when there are free solutions that demonstrably work better, I'm not sure.
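For anyone wanting to script Harry's parallel segments rather than hand-editing each command, something along these lines should work (same xxx placeholders as above; -QT added per the adaptive-rate discussion earlier; untested against the Broad server):

# launch the five -C segments from one loop;
# assumes key-based authentication so the backgrounded processes don't each prompt for a password
N=5
for i in $(seq 1 $N); do
    /root/.aspera/connect/bin/ascp -C $i:$N -O $((33000 + i)) -QT -l 1000m \
        user@xxx.broadinstitute.org:/xxx /xxx/yyy/broad &
done
wait   # note: Harry reports the -C 5:5 / port 33005 combination was refused in his case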

Harry Mangalam said...

I have incorporated a short segment about Aspera's ascp into my "How to transfer large amounts of data via network."

Corrections happily accepted.

Harry Mangalam said...

OK..
how about this link?

Bob Bae said...

A good alternative to Aspera is a tiny open source program called xc. It is at http://github.com/speedops/xc

Sucheta Tripathy PI @ Computational Genomics Group at IICB, Kolkata said...

NCBI says Aspera is free, but the Aspera site asks for a subscription. Can you please direct me to where I can download Aspera (ascp) for free?
Many thanks

Anonymous said...

Check out SuperTCP (supertcp.com). They sound like Aspera, but can operate directly on TCP, so you don't have to totally change your workflow to get faster transfers!

Anonymous said...

Hi Morgan
I am a new user of Aspera, and thanks for your blog.
I tried the command "ascp -QT -i ~/asperaweb_id_dsa.putty era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/SRR345/SRR346368/SRR346368.fastq.gz" and the terminal told me "ascp: 'ascp-license' could not be found, exiting."
But I have copied the "ascp-license" file from .aspera/connect/etc to /usr/local/bin, and I don't know why the PC can't find this file.
If you have some suggestion, please let me know.
Thanks again.

Unknown said...

##DOWNLOADING BUNCH OF FILES FROM NCBI

Since I had a hard time figuring out how to download a list of files/databases using the --file-list flag, I thought I would post the command here, which may help some people.

If you want to download a bunch of files, write a file that lists the required databases/files with their paths, e.g.:

/blast/db/env_nt.04.tar.gz
/blast/db/env_nt.05.tar.gz
/blast/db/env_nt.06.tar.gz

and then download all of these files with one command:

~/.aspera/connect/bin/ascp -l640M -T -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh \
    --user=anonftp --host=ftp.ncbi.nlm.nih.gov --mode=recv --file-list=filelist DestinationFolderwithPATH
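If the parts are numbered, the list itself can also be generated with a small loop rather than typed out (a sketch using the same files as above; check the names against the FTP listing first):

for i in 04 05 06; do
    echo "/blast/db/env_nt.$i.tar.gz"
done > filelist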

I hope this is helpful.

Muhammad

Anonymous said...

It's 2016 and Aspera still sucks.

No globbing. No sensible logs or error messages. Still difficult to get hold of. The switches are still overly complicated and not particularly powerful. What a shame that transferring data is still a thing bioinformatics struggles with.

Laurent Martin said...

For a much better command line with included automatic resume and proper configuration file use:
https://github.com/IBM/aspera-cli
