.. _installation:

************
Installation
************

Installing HTBAC
===========================

To install HTBAC, first create a virtualenv or conda environment. Open a
terminal and run one of the two sets of commands below:

.. code-block:: bash

    # Option 1: virtualenv
    virtualenv $HOME/venv
    source $HOME/venv/bin/activate

    # Option 2: conda
    conda create -n venv python=2.7
    source activate venv

HTBAC uses the `Ensemble Toolkit `_ to execute ensemble-based workflows.
Install the Ensemble Toolkit before installing HTBAC by following the
`installation guide `_.

Once the Ensemble Toolkit is properly installed, install HTBAC by running the
following commands:

.. code-block:: bash

    git clone https://github.com/radical-cybertools/htbac.git
    cd htbac
    pip install .

Preparing the Environment
=========================

HTBAC uses `RADICAL Pilot `_ as the runtime system. RADICAL Pilot can access
HPC clusters remotely via SSH and GSISSH, but it requires (a) a MongoDB server
and (b) a properly set-up passwordless SSH/GSISSH environment.

MongoDB Server
--------------

.. figure:: figures/hosts_and_ports.png
   :width: 360pt
   :align: center
   :alt: MongoDB and SSH ports.

The MongoDB server is used to store and retrieve operational data during the
execution of an application using RADICAL-Pilot. The MongoDB server must be
reachable on **port 27017** from **both** the host that runs the HTBAC
application and the host that executes the MD tasks, i.e., the HPC cluster
(see blue arrows in the figure above). In our experience, a small VM instance
(e.g., Amazon AWS) works exceptionally well for this.

.. warning::

   If you want to run your application on your laptop or private workstation,
   but run your MD tasks on a remote HPC cluster, installing MongoDB on your
   laptop or workstation won't work. Your laptop or workstation usually does
   not have a public IP address and is hidden behind a masked and firewalled
   home or office network. This means that the components running on the HPC
   cluster will not be able to access the MongoDB server.

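
The reachability requirement above can be checked before launching a workflow.
The following is a minimal sketch, not part of HTBAC or RADICAL-Pilot: the
function name ``mongodb_port_reachable`` and its parameters are hypothetical
and chosen only for illustration. It opens a plain TCP connection, so it shows
that the port is open but not that a MongoDB server is answering on it.

.. code-block:: python

    import socket


    def mongodb_port_reachable(host, port=27017, timeout=5.0):
        """Return True if a TCP connection to host:port can be opened.

        Hypothetical helper for illustration; 27017 is the port named in
        the text above.
        """
        try:
            # Resolves the host name and performs the TCP handshake.
            sock = socket.create_connection((host, port), timeout=timeout)
        except socket.error:
            # Covers refused connections, timeouts, and unresolvable hosts.
            return False
        sock.close()
        return True

Calling ``mongodb_port_reachable("<your MongoDB host>")`` once on the machine
that runs the HTBAC application and once on a login node of the HPC cluster
should return ``True`` in both places if the network setup matches the figure.
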
A MongoDB server can support more than one user. In an environment where
multiple users use Ensemble Toolkit, a single MongoDB server for all users /
hosts is usually sufficient.

**Install your own MongoDB**

Once you have identified a host that can serve as the new home for MongoDB,
installation is straightforward. You can either install the MongoDB server
package that is provided by most Linux distributions, or follow the
installation instructions on the MongoDB website:

http://docs.mongodb.org/manual/installation/

**MongoDB-as-a-Service**

There are multiple commercial providers of hosted MongoDB services, some of
them offering free usage tiers. We have had good experiences with the
following:

https://mlab.com/

.. _ssh_gsissh_setup:

Setup passwordless SSH Access to machines
-----------------------------------------

To set up passwordless access to another machine, create an RSA key pair on
your local machine and append the public key to the ``~/.ssh/authorized_keys``
file on the remote machine. `This `_ is a recommended tutorial to create
passwordless SSH access.

An easy way to set up SSH access to multiple remote machines is to create a
file ``~/.ssh/config``. Suppose the address used to access a specific machine
is ``foo@machine.example.com``. You can create an entry in this config file as
follows:

.. code-block:: bash

    # contents of $HOME/.ssh/config
    Host machine1
        HostName machine.example.com
        User foo

Now you can log in to the machine with ``ssh machine1``.

Source: http://nerderati.com/2011/03/17/simplify-your-life-with-an-ssh-config-file/

Setup GSISSH Access to a machine
---------------------------------

Setting up GSISSH access to a machine is a bit more complicated. We have
documented the steps to set up GSISSH on `Ubuntu `_ (tested for trusty and
xenial) and `Mac `_. Simply execute all the commands; see the comments for
details.

The above links document the overall procedure and how to get certificates to
access XSEDE machines.
Depending on the machine you want to access, you will have to get the
certificates from the corresponding locations. In most cases, this information
is available in their user guide.

Troubleshooting
=======================

**Missing virtualenv**

This should return the version of the RADICAL-Pilot installation, e.g., `0.X.Y`.

If virtualenv **is not** installed on your system, you can try the following.

.. code-block:: bash

    wget --no-check-certificate https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.9.tar.gz
    tar xzf virtualenv-1.9.tar.gz

    python virtualenv-1.9/virtualenv.py $HOME/myenv
    source $HOME/myenv/bin/activate

**TypeError: 'NoneType' object is not callable**

Note that some Python installations have a broken multiprocessing module -- if
you experience the following error during installation::

    Traceback (most recent call last):
      File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
        func(*targs, **kargs)
      File "/usr/lib/python2.7/multiprocessing/util.py", line 284, in _exit_function
        info('process shutting down')
    TypeError: 'NoneType' object is not callable

you may need to move to Python 2.7 (see http://bugs.python.org/issue15881).

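
When working through the items above, it can help to confirm which interpreter
is active and whether ``virtualenv`` is on the ``PATH``. The sketch below is
one way to collect that information. The helper names ``find_on_path`` and
``environment_report`` are hypothetical and not part of HTBAC. The check only
recognises virtualenv and venv environments, so a conda environment is
reported as ``False``.

.. code-block:: python

    import os
    import sys


    def find_on_path(program):
        """Return the full path of `program` if it is on PATH, else None."""
        for directory in os.environ.get("PATH", "").split(os.pathsep):
            candidate = os.path.join(directory, program)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
        return None


    def environment_report():
        """Collect basic facts about the active Python environment."""
        return {
            "python_version": "%d.%d.%d" % tuple(sys.version_info[:3]),
            "python_executable": sys.executable,
            "virtualenv_path": find_on_path("virtualenv"),
            # Older virtualenv releases set sys.real_prefix; venv and newer
            # virtualenv releases make sys.base_prefix differ from sys.prefix.
            "in_virtualenv": hasattr(sys, "real_prefix")
            or getattr(sys, "base_prefix", sys.prefix) != sys.prefix,
        }

Printing ``environment_report()`` inside the activated environment shows
whether the interpreter in use is the one you intended and whether the
``virtualenv`` command is available.
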