nextnano.cloud

Option A: HTCondor

You can use HTCondor to run the nextnano software on your local computer infrastructure (“on-premise”). nextnanomat then submits each job either to the local machine or to the HTCondor cluster.

This feature is only supported with our new license system (*.lic)!

Download the recommended HTCondor installer from the HTCondor website. Click on Download and choose the Current Stable Release (e.g. HTCondor 8.6.12) from UW Madison. We recommend the Native Package for Windows.

  • Latest tested version: 8.6.12: condor-8.6.12-446077-Windows-x64.msi
  • Previously tested version: 8.6.11: condor-8.6.11-440910-Windows-x64.msi

When you download the installer, you have to enter your name, email address and institution.

  1. Start installer
  2. Click Next and accept License Agreement
  3. Next, there are two options. One dedicated computer manages all HTCondor jobs (the Central Manager); all other computers are ordinary pool members. If there is no Central Manager yet, you have to create a new pool.
    1. If you are on the Central Manager, choose Create a new HTCondor Pool and fill in the name of the Pool, e.g. nextnanoCondorPool. This is a unique name for your pool of machines.
    2. If you are not the Central Manager, choose Join an existing HTCondor Pool and fill in the hostname of the central manager, e.g. computername where nextnanoCondorPool has been created.
  4. Tick Submit jobs to HTCondorPool and choose Always run jobs and never suspend them. (Alternatives: if you do not want other people to run jobs on your machine at all, select Do not run jobs on this machine; if you do not want other people to run jobs on your machine while you are working, select When keyboard has been idle for 15 minutes. You can modify these settings later.)
  5. Fill in your domain name, i.e. your Windows domain, e.g. yourcompanyname.com (without www). Leave it blank if you are unsure.
  6. Hostname of SMTP Server and email address of administrator (not needed currently, leave it blank)
  7. Path to Java Virtual Machine (not needed currently, leave it blank)
  8. Host with Read access: *
  9. Host with Write access: $(CONDOR_HOST), $(IP_ADDRESS), *.yourdomainname.com, 192.168.178.* (replace the default *.cs.wisc.edu with your domain name and add your local IP subnet, e.g. 192.168.178.*)
  10. Host with Administrator access: * (or $(IP_ADDRESS))
  11. Enable VM Universe: No
  12. Choose an installation directory (e.g. C:\condor\) and press Next
  13. Press Install (You need Administrator rights.)
  14. Once installed, you have to restart the computer. Then your new pool or pool member should be up and running.
  15. To be able to submit jobs from nextnanomat to HTCondor, you have to store your credentials once. Open a command shell and type the following command: condor_store_cred add
    • Enter your password and you are ready to submit your first HTCondor job.
    • If this does not work, enter condor_store_cred add -debug to get more detailed output about the error.
  16. In order to submit jobs from nextnanomat to HTCondor, you have to enable the following option in nextnanomat: Tools → Options → Expert settings → Show nonworking and experimental features
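
The Write access list in step 9 uses wildcard entries. To get a feeling for which machines such a list would admit, the entries can be approximated with shell-style pattern matching. The following Python sketch is only an illustration and not part of HTCondor or nextnanomat: HTCondor's own matching rules are authoritative, macros such as $(CONDOR_HOST) are skipped because only HTCondor can expand them, and the function name host_allowed and the host name pc42 are made up for this example.

```python
from fnmatch import fnmatchcase


def host_allowed(host, allow_list):
    """Approximate check of `host` against a comma-separated ALLOW_* list.

    Entries containing $(...) macros are ignored, since only HTCondor can
    expand them. Matching is case-insensitive with shell-style wildcards.
    """
    patterns = [p.strip() for p in allow_list.split(",") if p.strip()]
    for pattern in patterns:
        if "$(" in pattern:
            continue  # macro, cannot be resolved outside HTCondor
        if fnmatchcase(host.lower(), pattern.lower()):
            return True
    return False


# Write access list as entered in step 9.
write_access = "$(CONDOR_HOST), $(IP_ADDRESS), *.yourdomainname.com, 192.168.178.*"

print(host_allowed("pc42.yourdomainname.com", write_access))  # True
print(host_allowed("192.168.178.23", write_access))           # True
print(host_allowed("10.0.0.5", write_access))                 # False
```

A machine that is rejected here is a likely candidate for a missing entry in the Write access list, but the final check is always whether HTCondor itself accepts the machine.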

Summary of settings (Example)

Hostname (for HTCondor pool): computername.yourcompanyname.com
Policy: "Always run jobs"
Accounting domain: yourcompanyname.com
Read access: *
Write access: $(CONDOR_HOST), $(IP_ADDRESS), *.yourcompanyname.com
Administrator: $(IP_ADDRESS)

Config file

You can find your condor config settings in the file C:\condor\condor_config. Let's look at an example.

  • Your company is called Simpson.
  • Your Windows domain is called simpson.com.
  • Your condor pool shall have the name TheSimpsonsCondorPool.
  • The condor host that manages the condor jobs has the computer name homer.simpson.com.
  • Your computer is called lisa.simpson.com.
  • The computers in your network have the IP range 192.168.188.* (or 2001:db8:2042::* in IPv6).

RELEASE_DIR = C:\condor
LOCAL_CONFIG_FILE = $(LOCAL_DIR)\condor_config.local
REQUIRE_LOCAL_CONFIG_FILE = FALSE
LOCAL_CONFIG_DIR = $(LOCAL_DIR)\config
use SECURITY : HOST_BASED
# on computer called homer:
#CONDOR_HOST = $(FULL_HOSTNAME)
# on computer called lisa:
CONDOR_HOST = homer
# only on computer called homer:
COLLECTOR_NAME = TheSimpsonsCondorPool
# empty if you do not have a domain:
#UID_DOMAIN =
UID_DOMAIN = simpson.com
# the next two entries are missing if you do not have a domain:
SOFT_UID_DOMAIN = TRUE
FILESYSTEM_DOMAIN = simpson.com
CONDOR_ADMIN =
SMTP_SERVER =
ALLOW_READ = *
ALLOW_WRITE = $(CONDOR_HOST), $(IP_ADDRESS), *.simpson.com, 192.168.188.*, 2001:db8:2042::*
ALLOW_ADMINISTRATOR = $(IP_ADDRESS)
use POLICY : ALWAYS_RUN_JOBS
#use POLICY : DESKTOP
WANT_VACATE = FALSE
WANT_SUSPEND = TRUE
# on computer called homer:
#DAEMON_LIST = MASTER SCHEDD COLLECTOR NEGOTIATOR STARTD
# on computer called lisa:
DAEMON_LIST = MASTER SCHEDD STARTD
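
The difference between homer (Central Manager) and lisa (pool member) shows up mainly in DAEMON_LIST: only the Central Manager runs COLLECTOR and NEGOTIATOR. The following Python sketch reads simple NAME = value lines and reports which role a config describes. It is a simplified illustration, not HTCondor's real parser: it treats only full lines starting with # as comments, skips "use" lines, and does not expand macros. The function names are hypothetical.

```python
def parse_condor_config(text):
    """Parse simple `NAME = value` lines into a dict (names upper-cased).

    Lines starting with '#' and `use CATEGORY : TEMPLATE` lines are skipped;
    macros such as $(LOCAL_DIR) are left unexpanded.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.lower().startswith("use "):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip().upper()] = value.strip()
    return settings


def is_central_manager(settings):
    """True if DAEMON_LIST contains both COLLECTOR and NEGOTIATOR."""
    daemons = settings.get("DAEMON_LIST", "").replace(",", " ").upper().split()
    return "COLLECTOR" in daemons and "NEGOTIATOR" in daemons


# Shortened versions of the example config for the two machines.
LISA = """\
CONDOR_HOST = homer
UID_DOMAIN = simpson.com
# pool member: no COLLECTOR or NEGOTIATOR
DAEMON_LIST = MASTER SCHEDD STARTD
"""

HOMER = """\
CONDOR_HOST = $(FULL_HOSTNAME)
COLLECTOR_NAME = TheSimpsonsCondorPool
DAEMON_LIST = MASTER SCHEDD COLLECTOR NEGOTIATOR STARTD
"""

print(is_central_manager(parse_condor_config(LISA)))   # False
print(is_central_manager(parse_condor_config(HOMER)))  # True
```

To try this on a real installation, the text of C:\condor\condor_config could be read from disk and passed to parse_condor_config in place of the shortened strings.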

Useful HTCondor commands for the Command Prompt

  • condor_submit <filename>.sub Submit a job to the pool.
  • condor_q Shows current state of own jobs in the queue.
    • condor_q -nobatch -global -allusers Shows the state of all jobs of all users in the cluster.
    • condor_q -goodput -global -allusers Shows state and occupied CPU of all jobs in the cluster.
    • condor_q -allusers -global -analyze Detailed information for every job in the cluster.
    • condor_q -global -allusers -hold Shows why jobs are in hold state.
  • condor_status Shows state of all available resources.
  • condor_rm Removes jobs from a queue:
    • condor_rm -all Removes all jobs from a queue.
    • condor_rm <cluster>.<id> Removes the job with ID <cluster>.<id>, as listed in the JOB_IDS column of condor_q. (condor_rm <cluster> without .<id> removes all jobs of that cluster.)
  • condor_release -all Releases all jobs that are in the hold state so that they can run again.
  • condor_restart Restarts all HTCondor daemons/services, e.g. after changes to the config file.
  • condor_version Returns the version number of HTCondor.
  • condor_store_cred query Returns information about the credentials stored for HTCondor jobs.
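
This page does not show what a <filename>.sub file for condor_submit looks like. The following Python sketch writes a generic, minimal submit description using standard HTCondor submit commands. It is an assumption-laden example, not the file that nextnanomat generates: the executable name simulate.exe, the input file input.in and the function name make_submit_description are placeholders.

```python
def make_submit_description(executable, arguments, input_files, job_name="job"):
    """Return the text of a minimal HTCondor submit description file."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        # Copy input files to the execute machine and results back on exit.
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {', '.join(input_files)}",
        f"output = {job_name}.out",
        f"error = {job_name}.err",
        f"log = {job_name}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


text = make_submit_description(
    executable="simulate.exe",   # placeholder, not a real nextnano binary name
    arguments="input.in",        # placeholder input file
    input_files=["input.in"],
    job_name="example",
)
print(text)
```

Saved as example.sub, such a file could be submitted with condor_submit example.sub and then followed with condor_q.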

FAQ

Q: I submitted a job to HTCondor, but the Batch line of nextnanomat is stuck at preparing. What is wrong?

A1: Did you store your credentials after the installation of HTCondor? If not, enter condor_store_cred add into the command prompt to add your password (see step 15 of the installation steps above).

A2: Did you change your password recently? If so, you have to re-enter your credentials for HTCondor. Enter condor_store_cred add into the command prompt to add your password (see step 15 of the installation steps above). If this does not work, enter condor_store_cred add -debug to get more detailed output about the error.
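
When a job is stuck, it can help to collect the output of a few of the commands listed above in one go, for example to attach it to a support request. The following Python sketch does this under the assumption that Python 3 is installed and the HTCondor tools are on PATH. The choice of commands and the function names are made up for this illustration, and the sketch does not interpret the output.

```python
import subprocess

# Commands taken from the list of useful HTCondor commands on this page.
DIAGNOSTIC_COMMANDS = (
    ["condor_version"],
    ["condor_store_cred", "query"],
    ["condor_q", "-global", "-allusers", "-hold"],
)


def run_command(argv):
    """Run one command; return (exit_code, output) or (None, error message)."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, str(exc)  # e.g. tool not found on PATH
    return done.returncode, done.stdout + done.stderr


def collect_diagnostics(run=run_command):
    """Run all diagnostic commands and return {command line: (code, output)}."""
    return {" ".join(argv): run(argv) for argv in DIAGNOSTIC_COMMANDS}


if __name__ == "__main__":
    for command, (code, output) in collect_diagnostics().items():
        print(f"> {command} (exit code: {code})")
        print(output)
```

The run parameter exists only so that the collection logic can be tried with a stand-in function on a machine without HTCondor.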

Option B: Amazon EC2 (AWS)

(We are working on it.)

nnm/cloud_computing.txt · Last modified: 2018/10/12 11:37 by carola.burkl