Bare metal¶
JupyterHub can be deployed on bare-metal systems. This is how the Gravity Exploration Institute at Cardiff University made a set of Python environments (3.6, 3.7 and 3.8) available to its users.
The main difference from a traditional JupyterHub installation is the use of conda to install a full environment rather than the recommended combination of pip/conda.
The main steps are:
Install JupyterHub dependencies:
Nodejs
npm
systemd
httpd
miniconda
Create JupyterHub group for sudospawner
Create JupyterHub user
Create a sudoers file
Create a configuration file
Create an anaconda virtual environment file for JupyterHub.
Create a sudospawner configuration file
Create JupyterHub’s systemd service file
Create JupyterHub’s httpd configuration file
Create JupyterHub’s static kernels
Provision the kernels with the required environments
Create a suitable script to start JupyterHub
Optionally create a suitable logo to display in JupyterHub
Start the JupyterHub service
An Ansible script is available to automate this process.
Installing Anaconda¶
The first step is to make sure that conda 4.8.3 or later is available on the system, and to download and install it otherwise:
miniconda/tasks/main.yml:
---
- name: check for existing miniconda
  stat:
    path: "{{ miniconda_conda_bin }}"
  changed_when: false
  register: miniconda_conda
- name: get installed miniconda version
  command: "{{ miniconda_conda_bin }} --version"
  changed_when: false
  register: installed_conda_version
  when: miniconda_conda.stat.exists
- name: check installed miniconda version
  set_fact:
    installed_conda_version: "{{ installed_conda_version.stdout | regex_search(version_output, '\\1') | first }}"
  vars:
    version_output: 'conda (.+)'
  when: installed_conda_version.stdout is defined
# install miniconda
# the version test compares version numbers; a plain "<" would compare strings
- when: not miniconda_conda.stat.exists or installed_conda_version is version(miniconda_min_version, '<')
  block:
    - name: download miniconda installer
      get_url:
        url: "{{ miniconda_installer_url }}"
        dest: "/tmp/{{ miniconda_installer_url | basename }}"
        mode: "0755"
    - name: install miniconda
      command: "bash /tmp/{{ miniconda_installer_url | basename }} -b -p {{ miniconda_prefix }}"
      become: true
      args:
        creates: "{{ miniconda_prefix }}"
    - name: delete miniconda installer
      file:
        path: "/tmp/{{ miniconda_installer_url | basename }}"
        state: absent
The variables used above are defined in the role's defaults and vars files:
miniconda/defaults/main.yml
miniconda_prefix: /opt/miniconda3
miniconda_min_version: "4.8.3"
miniconda_installer_url: https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
miniconda/vars/main.yml
---
miniconda_conda_bin: "{{ miniconda_prefix }}/condabin/conda"
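The version check deserves some care: conda versions are dotted strings, and comparing them as plain strings orders "4.10.0" before "4.8.3". The following is a minimal Python sketch, not part of the Ansible role, of parsing the output of `conda --version` and comparing versions numerically. The function names are hypothetical, and the sketch assumes purely numeric version components.

```python
import re


def parse_conda_version(output: str) -> tuple:
    """Extract a tuple of integers from output such as 'conda 4.8.3'."""
    match = re.search(r"conda (\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError(f"unexpected conda output: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def needs_install(output, min_version: str) -> bool:
    """Return True when conda is missing (output is None) or too old."""
    if output is None:
        return True
    minimum = tuple(int(part) for part in min_version.split("."))
    return parse_conda_version(output) < minimum


# Numeric comparison: 4.10.0 is newer than 4.8.3, so no install is needed.
print(needs_install("conda 4.10.0", "4.8.3"))  # False
# Plain string comparison gets the order wrong.
print("4.10.0" < "4.8.3")  # True
```

Comparing tuples of integers gives the expected ordering for dotted versions without any third-party dependency.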
Installing other dependencies¶
Depending on the Linux distribution used, the way to install the rest of the dependencies may vary. This guide has been tested on CentOS 7.
In this step we install (in case they are not already available):
nodejs
npm
systemd
httpd
In the case of nodejs, it might be necessary to first install the Node.js repository. We can do this by downloading the repository installer and running it locally. The following Ansible role performs the installation:
node/tasks/main.yml
---
- name: download nodejs repo installer
  get_url:
    url: "https://rpm.nodesource.com/setup_15.x"
    dest: "/tmp/nodejsrepo"
    mode: "0755"
- name: Install nodejs repository
  command: "bash /tmp/nodejsrepo"
  become: true
# npm is installed as part of nodejs
- name: install jupyterhub deps
  yum:
    name:
      - nodejs
      - systemd
      - httpd
    state: installed
  become: true
Authentication¶
For demonstration purposes we show how to configure authentication via PAM (LIGO uses Shibboleth for authentication). The authentication method is defined in the JupyterHub configuration file:
jupyterhub/templates/jupyterhub_config.py.j2
c.Application.log_level = 10
# By commenting out the authenticator class
# jupyterhub falls back to using PAM
#c.JupyterHub.authenticator_class = 'jhub_remote_user_authenticator.remote_user_auth.RemoteUserLocalAuthenticator'
c.ConfigurableHTTPProxy.command = '/opt/jupyterhub/bin/configurable-http-proxy'
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.SudoSpawner.sudospawner_path = '/opt/jupyterhub/bin/sudospawner'
# Make JupyterLab the default
c.Spawner.default_url = '/lab'
c.PAMAuthenticator.open_sessions=False
As mentioned, with the authenticator class commented out, JupyterHub falls back to the PAM authentication method.
IRIS IAM¶
It is possible to authenticate using IRIS IAM. For this some changes and additional packages are required.
Dependencies: OAuthenticator should be installed using pip. This can be added to the original conda environment definition:
jupyterhub/files/jupyterhub-environment-iris-iam.yml
name: jupyterhub
channels:
  - conda-forge
dependencies:
  - configurable-http-proxy
  - jupyterlab
  - pip
  - python=3.8
  - sudospawner
  - pip:
      - jhub_remote_user_authenticator
      - "--editable=git+https://github.com/jupyterhub/oauthenticator.git@master"
Client registration: JupyterHub needs to be registered as an IAM client on IRIS IAM.
From here follow the instructions in the INDIGO IAM documentation site. You will need to enter the public IP address and port of the server running JupyterHub in the Redirect URI(s) field.
Make sure to save the client credentials for your client as they will allow you to modify its settings later on and configure JupyterHub.
JupyterHub configuration: The GenericOAuthenticator class is used to interact with IRIS IAM and requires configuring a few settings, including the client ID and client secret obtained in the previous step. We also need to provide the address to which the user will be redirected after successful authentication (this address needs to match the one defined during the client registration step).
IAM authentication requires defining a few environment variables and making them visible to JupyterHub. In our current setup the JupyterHub user, jupyterhub, is a non-interactive (nologin) user, so one way to define these environment variables is to set them in the JupyterHub configuration file using the os module.
Another thing to keep in mind is that our current setup requires the authenticated user to have a matching account on the system running JupyterHub. That is, if user “John” authenticates in IRIS IAM, JupyterHub’s spawner (sudospawner in our case) will try to start a Jupyter server for user “John” on the local system and will fail if the user cannot be found.
It is also possible to filter users by their IAM group using the allowed_groups option. For example, we can specify that only members of the jupyterhub-da/stfccloud group have access to our hub. NOTE: allowed_groups, together with other options described on the OAuthenticator website, is not yet supported in the latest OAuthenticator release (0.13.0 at the time of writing), so we need to use the master branch from GitHub. This might change in future releases.
c.Application.log_level = 10
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.SudoSpawner.sudospawner_path = '/opt/jupyterhub/bin/sudospawner'
# Make JupyterLab the default
c.Spawner.default_url = '/lab'
c.ConfigurableHTTPProxy.command = '/opt/jupyterhub/bin/configurable-http-proxy'
# Authenticator
import os
import subprocess
import sys
os.environ['OAUTH2_AUTHORIZE_URL'] = 'https://iris-iam.stfc.ac.uk/authorize'
os.environ['OAUTH_CALLBACK_URL'] = 'http://<JH-IP>:<PORT>/hub/oauth_callback'
os.environ['OAUTH2_TOKEN_URL'] = 'https://iris-iam.stfc.ac.uk/token'
from oauthenticator.generic import GenericOAuthenticator
c.JupyterHub.authenticator_class = GenericOAuthenticator
c.GenericOAuthenticator.login_service = 'IRIS IAM'
c.GenericOAuthenticator.client_id = '<COPY_FROM_IRIS_IAM_CLIENT_REGISTRATION>'
c.GenericOAuthenticator.client_secret = '<COPY_FROM_IRIS_IAM_CLIENT_REGISTRATION>'
c.GenericOAuthenticator.userdata_url = 'https://iris-iam.stfc.ac.uk/userinfo'
c.GenericOAuthenticator.token_url = 'https://iris-iam.stfc.ac.uk/token'
c.GenericOAuthenticator.userdata_method = 'GET'
# assignment (=) is required here; a colon would only annotate and set nothing
c.GenericOAuthenticator.userdata_params = {'state': 'state'}
c.GenericOAuthenticator.username_key = 'preferred_username'
c.GenericOAuthenticator.oauth_callback_url = 'http://<JH-IP>:<PORT>/hub/oauth_callback'
c.GenericOAuthenticator.allowed_groups = ['jupyterhub-da/stfccloud']
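Because the spawner needs a local account for every authenticated user, it can help to check in advance whether a given IAM username exists on the host. The following is a minimal sketch using Python's standard `pwd` module (Unix only). The function name `local_account_exists` and the example usernames are illustrative and are not part of JupyterHub or of the deployment described here.

```python
import pwd


def local_account_exists(username: str) -> bool:
    """Return True if `username` has an entry in the local password database."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        # getpwnam raises KeyError when no matching entry is found
        return False
    return True


# "john" stands in for a username returned by IRIS IAM (hypothetical).
for name in ("root", "john"):
    status = "found" if local_account_exists(name) else "missing"
    print(f"{name}: {status}")
```

A check like this could be run before handing out access, so that users without a local account are identified before their first failed spawn.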
After configuration, the user navigates to the JupyterHub server address and is greeted by a login message.
The user should then be redirected to the IRIS IAM login website.
Our client needs to be approved by the user the first time it is used by that user. After authorization the user should be redirected to the Jupyter server spawned by the Hub.
Spawner¶
LIGO uses a custom spawner for JupyterHub (SudoSpawner) to start each single-user notebook server. This spawner enables JupyterHub to run without being root by spawning an intermediate process via sudo, which is a sensible choice to improve system security. In the JupyterHub configuration file this is controlled with sudospawner_path. In addition, SudoSpawner requires setting up the user that will actually run the Hub and defining which commands it is allowed to execute on behalf of users. This is done via a couple of configuration files:
A systemd configuration file for JupyterHub that defines the right user and location where jupyterhub command should be invoked:
jupyterhub/templates/jupyterhub.service.j2
[Unit]
Description=JupyterHub
Requires=firewalld.service
After=network-online.target
[Service]
User={{ jupyter_server_user }}
Environment="PATH={{ jupyterhub_prefix }}/bin:/sbin:/bin:/usr/sbin:/usr/bin"
ExecStart={{ jupyterhub_prefix }}/bin/jupyterhub
WorkingDirectory={{ jupyterhub_config_directory }}
Restart=on-failure
[Install]
WantedBy=multi-user.target
And a sudoers file that defines the command that JupyterHub’s user is allowed to execute. Users are allowed to spawn a Jupyter Notebook if they are members of a particular group (LIGO in LIGO’s case):
jupyterhub/templates/jupyterhub.sudofile.j2
# the command(s) the Hub can run on behalf of the above users without needing a
# password; the exact path may differ, depending on how sudospawner was installed
Cmnd_Alias JUPYTER_CMD = {{ jupyterhub_prefix }}/bin/sudospawner
# actually give the Hub user permission to run the above command on behalf
# of the above users without prompting for a password
{{ jupyter_server_user }} ALL=(%{{ jupyter_server_sudo_group }}) NOPASSWD:JUPYTER_CMD
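To see what this template expands to, the sketch below substitutes the default values listed in the Defaults section (`jupyterhub_prefix`, `jupyter_server_user`, `jupyter_server_sudo_group`) into its two active lines. It uses plain Python string formatting as a stand-in for Jinja2, so it only approximates what Ansible's template module does; the function name `render_sudoers` is hypothetical.

```python
def render_sudoers(prefix: str, server_user: str, sudo_group: str) -> str:
    """Render the two non-comment lines of the sudoers template."""
    lines = [
        # the single command the Hub user may run through sudo
        f"Cmnd_Alias JUPYTER_CMD = {prefix}/bin/sudospawner",
        # %group in the run-as position means "as any member of that group"
        f"{server_user} ALL=(%{sudo_group}) NOPASSWD:JUPYTER_CMD",
    ]
    return "\n".join(lines) + "\n"


rendered = render_sudoers("/opt/jupyterhub", "jupyterhub", "LIGO")
print(rendered, end="")
```

With these defaults, the rendered rule lets the jupyterhub user run sudospawner as any member of the LIGO group without being prompted for a password.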
Defaults¶
It is useful to define default values for some of the parameters used in our configuration files. Storing them in a separate file makes it easier to adapt these templates to different cases.
jupyterhub/defaults/main.yml
---
igwn_conda_root: "/cvmfs/oasis.opensciencegrid.org/ligo/sw/conda"
igwn_env_name:
- igwn-py36
- igwn-py36-proposed
- igwn-py36-testing
- igwn-py37
- igwn-py37-proposed
- igwn-py37-testing
- igwn-py38
- igwn-py38-proposed
- igwn-py38-testing
jupyter_server_user: jupyterhub
jupyter_server_sudo_group: LIGO
jupyterhub_config_directory: "/etc/jupyterhub"
jupyterhub_log_directory: "/var/log/jupyterhub"
jupyterhub_datadir: "/usr/local/share/jupyter"
jupyterhub_prefix: "/opt/jupyterhub"
...
Main tasks¶
This script runs the main tasks required to deploy our JupyterHub service.
jupyterhub/tasks/main.yml
---
# CREATE CONDA ENVIRONMENT
- name: create temporary directory
  tempfile:
    state: directory
    suffix: jupyterhub
  register: jupyterhub_tempdir
- name: copy jupyterhub environment file
  copy:
    src: jupyterhub-environment.yml
    dest: "{{ jupyterhub_tempdir.path }}/environment.yml"
    owner: root
    group: root
    mode: "0644"
  register: jupyterhub_environment_yaml
- name: check jupyterhub environment exists
  stat:
    path: "{{ jupyterhub_prefix }}"
  register: jupyterhub_environment
- name: create jupyterhub environment
  command: "{{ miniconda_conda_bin }} env create
    --file {{ jupyterhub_environment_yaml.dest }}
    --prefix {{ jupyterhub_prefix }}
    --quiet"
  when: not jupyterhub_environment.stat.exists
- name: update jupyterhub environment
  command: "{{ miniconda_conda_bin }} env update
    --file {{ jupyterhub_environment_yaml.dest }}
    --prefix {{ jupyterhub_prefix }}
    --quiet"
  when: jupyterhub_environment.stat.exists
- name: delete tempdir
  file:
    path: "{{ jupyterhub_tempdir.path }}"
    state: absent
# END CREATE CONDA ENVIRONMENT
# SUDOSPAWNER SETUP
# seems like this is not really needed because sudospawner is installed as part
# of conda and everybody has permissions to execute it??
- name: create jupyterhub group for sudospawner
  group:
    name: "{{ jupyter_server_sudo_group }}"
    state: present
- name: create jupyterhub user
  user:
    name: "{{ jupyter_server_user }}"
    comment: "jupyterhub server user"
    system: "yes"
    state: present
    createhome: "no"
    shell: /sbin/nologin
    groups: "{{ jupyter_server_sudo_group }}"
- name: copy jupyterhub sudoers file
  template:
    src: jupyterhub.sudofile.j2
    dest: /etc/sudoers.d/jupyterhub
    owner: root
    group: root
    mode: "0644"
    validate: /usr/sbin/visudo -cf %s
- name: copy Jupyter sudospawner config file over
  template:
    src: sudospawner-singleuser.j2
    dest: "{{ jupyterhub_prefix }}/bin/sudospawner-singleuser"
    owner: root
    group: root
    mode: "0755"
## END SUDOSPAWNER SETUP
## JHUB CONFIGURATION
- name: create jupyterhub config directory
  file:
    path: "{{ jupyterhub_config_directory }}"
    state: directory
    owner: root
    group: "{{ jupyter_server_sudo_group }}"
    mode: "0775"
- name: copy jupyterhub config file over
  template:
    src: jupyterhub_config.py.j2
    dest: "{{ jupyterhub_config_directory }}/jupyterhub_config.py"
    owner: "{{ jupyter_server_user }}"
    group: root
    mode: "0644"
- name: copy systemd service file over
  template:
    src: jupyterhub.service.j2
    dest: /etc/systemd/system/jupyterhub.service
    owner: root
    group: root
    mode: "0644"
## END JHUB CONFIGURATION
- name: copy static jupyter kernels
  copy:
    src: kernels
    dest: "{{ jupyterhub_datadir }}"
    owner: root
    group: root
    mode: preserve
- name: provision IGWN jupyter kernels
  file:
    path: "{{ jupyterhub_datadir }}/kernels/{{ item }}"
    state: directory
    owner: root
    group: root
    mode: "0755"
  with_items:
    - "{{ igwn_env_name }}"
- name: create IGWN jupyter kernel.json
  template:
    src: kernel.json.j2
    dest: "{{ jupyterhub_datadir }}/kernels/{{ item }}/kernel.json"
  with_items:
    - "{{ igwn_env_name }}"
- name: create IGWN jupyter start.sh
  template:
    src: start.sh.j2
    dest: "{{ jupyterhub_datadir }}/kernels/{{ item }}/start.sh"
  with_items:
    - "{{ igwn_env_name }}"
- name: create IGWN jupyter logo
  copy:
    src: igwn-logo-64x64.png
    dest: "{{ jupyterhub_datadir }}/kernels/{{ item }}/logo-64x64.png"
  with_items:
    - "{{ igwn_env_name }}"
- name: start jupyterhub service
  service:
    name: jupyterhub
    state: started
    enabled: yes
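The kernel.json.j2 and start.sh.j2 templates are not reproduced in this guide. As a rough sketch of what a generated kernel spec could contain, the Python below builds a kernel.json for one environment name. The directory layout comes from the tasks above, and `argv`, `display_name` and `language` are standard Jupyter kernel spec keys; the `argv` contents (calling the per-kernel start.sh with the connection file) and the display name are assumptions, not the contents of the actual templates. The function name `build_kernel_spec` is hypothetical.

```python
import json


def build_kernel_spec(datadir: str, env_name: str) -> dict:
    """Build an illustrative kernel spec for one conda environment."""
    kernel_dir = f"{datadir}/kernels/{env_name}"
    return {
        # Assumption: start.sh activates the conda environment and then
        # launches the kernel with the arguments it receives.
        # Jupyter replaces {connection_file} at launch time.
        "argv": [f"{kernel_dir}/start.sh", "-f", "{connection_file}"],
        # Assumption: the environment name doubles as the display name.
        "display_name": env_name,
        "language": "python",
    }


spec = build_kernel_spec("/usr/local/share/jupyter", "igwn-py38")
print(json.dumps(spec, indent=2))
```

Wrapping the kernel launch in a script is one way to activate a conda environment before the kernel starts, which may be why each kernel directory gets its own start.sh.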
Running deployment¶
The roles above can be run with an Ansible playbook:
jhub.yml
#- hosts: all
- hosts: 127.0.0.1
  roles:
    - miniconda
    - node
    - { role: jupyterhub, become: yes }