How to Efficiently Upload or Download Large Datasets to/from WRDS
Transferring large SAS datasets from WRDS can be time-consuming, especially if you are located outside the US. One efficient approach is to compress the data before the transfer. This usually takes three steps: 1. compress the data on the WRDS server; 2. transfer the data in SAS or with SSH Secure File Transfer Client; 3. decompress the file on your local machine. A more direct way is to use a Unix utility called "rsync," which is both a file synchronizer and a file transfer program. Here, I detail the steps you can take to transfer files from/to WRDS easily and efficiently.
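How much compression helps depends on how compressible the file is. As a rough check before a long transfer, the sketch below compares a file's size with its gzip-compressed size. The function names original_size and compressed_size are made up for this illustration, and gzip is only a proxy for whatever compression rsync applies on the wire, so treat the numbers as an estimate.

```shell
# Hypothetical helpers (names invented for this sketch).

# Print the size of a file in bytes.
original_size() {
    wc -c < "$1" | tr -d ' '
}

# Print the size in bytes the file would have after gzip compression.
# The file itself is left untouched; gzip -c writes to standard output.
compressed_size() {
    gzip -c "$1" | wc -c | tr -d ' '
}

# Usage, on any machine where the file already sits:
#   original_size stopptg.sas7bdat
#   compressed_size stopptg.sas7bdat
```

If the two numbers are close, the file is already compact and compression will save little time.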
1. If you need to transfer files from/to your own PC, you need to set up an SSH server on your PC. You can use any SSH server for Windows. Personally, I use MobaSSH (it's free; choose the Home edition when you download it). After you download it, unzip it, copy it into any folder, run the software, install it, and start the MobaSSH service. Now you have set up your own SSH server.
2. To access the SSH server, you need an SSH client. I highly recommend PuTTY (it's also free; choose "A Windows installer for everything except PuTTYtel", version 0.64 at the time of writing). Once you install it, you can use PuTTY to access the SSH server you just set up. You will need to know your own IP address (just google "what is my ip") and the SSH port (22). Log in with the same username and password you use on your PC (if you don't know which usernames are usable, click the Users tab in MobaSSH to see them).
3. Once you log in to your own SSH server, you can transfer data from/to WRDS by running rsync in that SSH session. Here are a few examples.
a. You want to transfer a dataset called stopptg.sas7bdat from WRDS's directory /wrds/ibes/sasdata/ to your own PC's D:\SSH:
rsync -zP [email protected]:/wrds/ibes/sasdata/stopptg.sas7bdat /cygdrive/d/SSH
Note: The -z option compresses the data during transfer. The -P option shows progress during transfer and keeps partially transferred files, so an interrupted transfer can be resumed. Replace username with your own WRDS username; you will be prompted for your WRDS password before the transfer starts. cygdrive is MobaSSH's link to the drives on your PC, and /cygdrive/d maps to your D drive. As a comparison, transferring this file took me 33 seconds with the compress option and 2 minutes and 40 seconds without it.
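The drive mapping described in the note can be sketched as a small shell function that turns a Windows path into its cygdrive form. The name win_to_cygdrive is invented for this illustration, and the function assumes a plain drive-letter path such as D:\SSH (no network shares, no relative paths).

```shell
# Hypothetical helper: convert a Windows path like 'D:\SSH'
# into the matching /cygdrive path.
win_to_cygdrive() {
    # First character is the drive letter; lower-case it.
    wtc_drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
    # Everything after the colon is the path; turn backslashes into slashes.
    wtc_rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/cygdrive/%s%s\n' "$wtc_drive" "$wtc_rest"
}

# Single quotes keep the backslash literal.
win_to_cygdrive 'D:\SSH'
```

Quoting the Windows path in single quotes matters, because an unquoted backslash would be consumed by the shell before the function sees it.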
b. You want to transfer a dataset called stopptg.sas7bdat from your own PC's D:\SSH directory to WRDS's /home/school/username/ directory:
rsync -zP /cygdrive/d/SSH/stopptg.sas7bdat [email protected]:/home/school/username/
c. You want to transfer all files and sub-directories from WRDS's /home/school/username directory to your own PC's D:\SSH:
rsync -zrP [email protected]:/home/school/username /cygdrive/d/SSH
Note: The -r option copies all files and recurses into sub-directories.
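If you run these transfers often, it can help to assemble the command from its parts and look at it before running it. The sketch below only prints the command and never contacts a server. The function name pull_cmd and the host wrds.example.edu are placeholders invented for this illustration, not the real WRDS address; substitute your own login.

```shell
# Hypothetical helper: print (do not run) an rsync command that pulls
# a remote path into a local directory.
#   $1 = login in user@host form (placeholder below, not a real host)
#   $2 = remote path on the server
#   $3 = local destination directory
pull_cmd() {
    printf 'rsync -zrP %s:%s %s\n' "$1" "$2" "$3"
}

# Example with made-up login values.
pull_cmd 'username@wrds.example.edu' '/home/school/username' '/cygdrive/d/SSH'
```

Two details are worth knowing when you adapt the printed command. Because the source path has no trailing slash, rsync creates a username directory inside the destination; with a trailing slash it copies only the directory's contents. Adding -n makes rsync do a dry run that lists what would be transferred without copying anything.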