# Deploy OpenSARLab to an AWS account
A note about deployments: A deployment of OpenSARLab refers to a standalone instance of OpenSARLab.
If you are setting up OpenSARLab for several classes and/or collaborative groups with disparate needs or funding sources,
it may be useful to give them each their own standalone deployment. This separates user group authentication,
simplifies billing for each group, and allows for easy cleanup at the end of a project or class (just delete the deployment).
In the following instructions, replace any occurrence of `deployment_name` with the deployment name you have chosen.
Make your deployment name lowercase and use no special characters other than dashes (-). It will be used to generate part of the Cognito callback URL, and CloudFormation stack names follow the same naming convention.
## Take AWS SES out of sandbox
The AWS Simple Email Service (SES) is used by OpenSARLab to send emails to users and administrators, including authentication-related notifications and storage lifecycle management messages.
While SES is in sandbox, you are limited to sending 1 email per second and no more than 200 in a 24-hour period, and emails may only be sent from an SES-verified address to other SES-verified addresses.
Note: Provide a detailed explanation of your SES use and email policies when applying to exit the sandbox, or you will be denied. Approval can take 24-48 hours.
- Follow these instructions to take your SES out of sandbox.
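Once approved, you can confirm your new sending limits from a terminal (a quick sanity check; assumes your AWS credentials are configured, and the region shown is an assumption):

```bash
# Check SES sending limits. Outside the sandbox, Max24HourSend should be
# well above the sandbox cap of 200 messages per 24 hours.
aws ses get-send-quota --region us-east-1
```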
## Create an AWS Cost Allocation Tag
Note: Only management accounts can create cost allocation tags.
- Create a cost allocation tag, or have one created by someone with access
- Give it an available name that makes sense for tracking deployment names associated with AWS resources
    - e.g., "deployment_name"
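If you prefer the CLI, a tag key that already appears on a resource can be activated for cost allocation like this (a sketch; the tag key `deployment_name` is the example name assumed above, and this must run in the management account):

```bash
# Activate an existing resource tag key as a cost allocation tag.
aws ce update-cost-allocation-tags-status \
    --cost-allocation-tags-status TagKey=deployment_name,Status=Active
```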
## Add dockerhub credentials to AWS Secrets Manager
This deployment uses a few publicly available docker images. Due to dockerhub rate limits (https://www.docker.com/increase-rate-limits), you will need to set up a dockerhub account; a free-tier account will suffice. CodePipeline's IP address is shared by many users, so you will likely hit the rate limit as an anonymous user (details here).
Note: By default, this secret will be used for multiple deployments. Optionally, you can edit the codebuild section in cf-cluster.yml to point to a different secret.
- If you don't have a dockerhub account, create one here
- Open the AWS Secrets Manager console
- Click the "Store a new secret" button
    - Page 1:
        - Select "Other type of secrets"
        - Select the "Plaintext" tab
        - Delete the default content
        - Add your username and password, separated by a space
            - Example: `username password`
        - Click the "Next" button
    - Page 2:
        - Secret name: `dockerhub/creds`
        - Click the "Next" button
    - Page 3:
        - Click the "Next" button
    - Page 4:
        - Click the "Store" button
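Equivalently, the secret can be created from the command line (a sketch; replace the placeholder string with your actual dockerhub username and password):

```bash
# Store dockerhub credentials as a plaintext secret named dockerhub/creds.
aws secretsmanager create-secret \
    --name dockerhub/creds \
    --secret-string "username password"
```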
## Setup an iCal calendar for notifications
Notifications are generated from iCal calendar events. ASF uses Google Calendar, but any publicly accessible iCal-formatted calendar should work as well.
- Create a public iCal-formatted calendar
- The iCal-formatted URL will be needed later
- Notification calendar events must be properly formatted
    - Formatting details are available in the "Take care of odds and ends" section
## Store your CA certificate
OpenSARLab will lack full functionality if it is not served over https (SSL certification).
- Follow these instructions to import your CA certificate into the AWS Certificate Manager
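For reference, a certificate can also be imported with the AWS CLI (a sketch; the file names are placeholders for your certificate, private key, and chain):

```bash
# Import an existing CA-signed certificate into AWS Certificate Manager.
aws acm import-certificate \
    --certificate fileb://certificate.pem \
    --private-key fileb://private_key.pem \
    --certificate-chain fileb://certificate_chain.pem
```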
## Prepare CodeCommit Repos
TODO: Do this differently

All the public OpenSARLab repos are in the ASFOpenSARlab GitHub org.
- Create a `deployment_name`-container CodeCommit repo in your AWS account
- Create a `deployment_name`-cluster CodeCommit repo
- Clone the `deployment_name`-container and `deployment_name`-cluster repos to your local computer using ssh
- cd into your local `deployment_name`-container repo
    - Add ASFOpenSARlab/opensarlab-container as a remote on your local `deployment_name`-container repo
        - `git remote add github https://github.com/ASFOpenSARlab/opensarlab-container.git`
    - Pull the remote opensarlab-container repo into your local `deployment_name`-container repo
        - `git pull github main`
    - Create a main branch in the `deployment_name`-container repo
        - `git checkout -b main`
    - Push to the remote `deployment_name`-container repo
        - `git push origin main`
- cd into your local `deployment_name`-cluster repo
    - Add ASFOpenSARlab/opensarlab-cluster as a remote on your local `deployment_name`-cluster repo
        - `git remote add github https://github.com/ASFOpenSARlab/opensarlab-cluster.git`
    - Pull the remote opensarlab-cluster repo into your local `deployment_name`-cluster repo
        - `git pull github main`
    - Create a main branch in the `deployment_name`-cluster repo
        - `git checkout -b main`
    - Push to the remote `deployment_name`-cluster repo
        - `git push origin main`

You should now have container and cluster repos in CodeCommit that are duplicates of those found in ASFOpenSARlab.
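End to end, the container half of that workflow looks like the following (a sketch; the deployment name `osl-demo` and the us-east-1 region are assumptions, your ssh key must already be registered with CodeCommit, and the cluster repo follows the same pattern):

```bash
# Create the CodeCommit repo and mirror the public GitHub code into it.
aws codecommit create-repository --repository-name osl-demo-container

git clone ssh://git-codecommit.us-east-1.amazonaws.com/v1/repos/osl-demo-container
cd osl-demo-container
git remote add github https://github.com/ASFOpenSARlab/opensarlab-container.git
git pull github main
git checkout -b main   # skip if the pull already left you on a local main branch
git push origin main
```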
## Customize opensarlab_container code for deployment
The opensarlab-container repo contains one example image named `helloworld`, which you can reference when creating new images.
Images can be used by multiple profiles.
Note: It is easiest to work in your local repo and push your changes when you're done.
- Duplicate the `images/sar` directory and rename it, using your chosen image name
    - The image name must be alpha-numeric with no whitespaces or special characters
- Edit the dockerfile
    - Adjust the packages in the 2nd apt install command to suit your image needs
    - Add any pip packages you wish installed in the base conda environment
    - Add any conda packages you wish installed in the base conda environment
    - Create any conda environments you would like pre-installed before "USER jovyan"
        - If using environment.yml files, store them in an "envs" directory in `/jupyter-hooks`, and they will be copied into the container
            - `RUN conda env create -f /etc/jupyter-hooks/envs/<image_name>_env.yml --prefix /etc/jupyter-hooks/envs/<image_name>`
    - Run any tests for this image that you added to the tests directory under `FROM release as testing`
- Remove the images/sar directory and sar.sh test script, unless you plan to use the sar image
- Add a test script for your image (see the sketch after this list)
    - Use sar.sh as an example
    - Name it `<image_name>.sh`
- Add, commit, and push changes to the remote CodeCommit repo
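For the test script itself, a hypothetical `mynewimage.sh` might just confirm that the image's pre-installed conda environment resolves (a sketch; the environment name `mynewimage_env` and the numpy import check are assumptions, not part of the repo):

```bash
#!/bin/bash
# Hypothetical smoke test for a custom image: check that the pre-installed
# conda environment exists and that a key package imports cleanly.
set -e

conda env list | grep mynewimage_env
conda run --prefix /etc/jupyter-hooks/envs/mynewimage_env python -c "import numpy"

echo "mynewimage tests passed"
```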
## Customize opensarlab_cluster code for deployment
- Create and add any additional custom Jupyter magic commands to the `opensarlab/jupyterhub/singleuser/custom_magics` directory
- Add any additional scripts you may have created for use in your image to the `opensarlab/jupyterhub/singleuser/hooks` directory
- Duplicate `opensarlab/jupyterhub/singleuser/hooks/sar.sh`, renaming it after your image name
- Edit `opensarlab/jupyterhub/singleuser/hooks/<image_name>.sh`
    - Copy any additional custom Jupyter magic scripts to `$HOME/.ipython/image_default/startup/` (alongside 00-df.py)
    - Edit the repos being pulled to suit your deployment and image needs
- Rename `opensarlab/opensarlab.example.yaml` to `opensarlab/opensarlab.yaml`
- Use the example notes in `opensarlab/opensarlab.yaml` to define the required and optional fields
- Update `opensarlab/jupyterhub/helm_config.yaml`
    - singleuser
        - Add any needed extraFiles
    - hub
        - Add any needed extraFiles
- Add, commit, and push changes to the remote CodeCommit repo
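The rename-and-push steps above, as shell commands (a sketch; the commit message is illustrative):

```bash
# Rename the example config, then publish the customizations.
git mv opensarlab/opensarlab.example.yaml opensarlab/opensarlab.yaml
# ...edit opensarlab.yaml and helm_config.yaml as described above...
git add -A
git commit -m "Customize cluster config for deployment"
git push origin main
```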
## Build the container CloudFormation stack
This will create the hub image and the images for each profile, and store them in namespaced ECR repos.
- Open CloudFormation in the AWS console
- Click the "Create stack" button and select "With new resources (standard)"
    - Page 1: Create stack
        - Under "Specify template", check "Upload a template file"
        - Use the file chooser to select cf-container.py from your local branch of the `deployment_name`-container repo
        - Click the "Next" button
    - Page 2: Specify stack details
        - Stack Name: use a recognizable name that makes sense for your deployment
        - CodeCommitSourceRepo: the CodeCommit repo holding the container code (`deployment_name`-container)
        - CodeCommitSourceBranch: the name of the production branch of the `deployment_name`-container CodeCommit repo
        - CostTagKey: the cost allocation key you registered for tracking deployment costs
        - CostTagValue: `deployment_name`
    - Page 3: Configure stack options
        - Tags:
            - Key: the cost allocation tag
            - Value: `deployment_name`
        - Click the "Next" button
    - Page 4: Review
        - Review and confirm correctness
        - Check the box next to "I acknowledge that AWS CloudFormation might create IAM resources"
        - Click the "Create Stack" button
- Monitor the stack build for errors and rollbacks
    - The screen does not self-update; use the refresh buttons
    - If the build fails and rolls back, go to the CloudFormation stacks page, then select and delete the failed stack before correcting any errors and trying again
## Build the cluster CloudFormation stack
This CloudFormation stack dynamically creates 3 additional stacks.
- Open CloudFormation in the AWS console
- Click the "Create stack" button and select "With new resources (standard)"
    - Page 1: Create stack
        - Under "Specify template", check "Upload a template file"
        - Use the file chooser to select `opensarlab/pipeline/cf-pipeline.yaml` from your local branch of the cluster repo
        - Click the "Next" button
    - Page 2: Specify stack details
        - Stack Name: use a recognizable name that makes sense for your deployment. Do not use a stack name that ends in `cluster`, `jupyterhub`, or `cognito`; these are reserved.
        - CodeCommitRepoName: the CodeCommit repo holding the cluster code (`deployment_name`-cluster)
        - CodeCommitBranchName: the name of the production branch of the `deployment_name`-cluster CodeCommit repo
        - CostTagKey: the cost allocation key you registered for tracking deployment costs
        - CostTagValue: `deployment_name`
    - Page 3: Configure stack options
        - Tags:
            - Key: the cost allocation tag
            - Value: `deployment_name`
        - Click the "Next" button
    - Page 4: Review
        - Review and confirm correctness
        - Check the box next to "I acknowledge that AWS CloudFormation might create IAM resources"
        - Click the "Create Stack" button
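Once the pipeline stack completes, you can confirm that the dynamically created child stacks came up (a sketch; stack names will vary with the name you chose):

```bash
# List stacks that are still creating or have finished creating.
aws cloudformation list-stacks \
    --stack-status-filter CREATE_IN_PROGRESS CREATE_COMPLETE \
    --query 'StackSummaries[].[StackName,StackStatus]' \
    --output table
```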
## Take care of odds and ends
- Update `deployment_url` in the cluster repo's `opensarlab/opensarlab.yaml` if you started off using a load balancer
    - Don't forget to update your DNS record
- Add the cost allocation tag to the EKS cluster
    - Navigate to the AWS EKS console
    - Click the "Clusters" link in the sidebar menu
    - Click on the cluster stack
    - Click the "Tags" tab
    - Click the "Manage tags" button
    - Click the "Add tag" button
        - Key: the cost allocation tag
        - Value: `deployment_name`
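The same tag can also be applied with the AWS CLI (a sketch; the region, account ID, cluster name, and tag key/value are all placeholders):

```bash
# Tag the EKS cluster so its costs roll up under the deployment tag.
aws eks tag-resource \
    --resource-arn arn:aws:eks:us-east-1:123456789012:cluster/my-deployment-cluster \
    --tags deployment_name=my-deployment
```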
- Prime the Auto Scaling group for each profile, unless there are active users
    - Navigate to the AWS EC2 console
    - Select the "Auto Scaling Groups" sidebar link
    - Select an autoscaling group
    - Group details:
        - Click the "Edit" button
        - Desired capacity: set to 1
        - Click the "Update" button
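Or, per Auto Scaling group, from the CLI (a sketch; the group name is a placeholder):

```bash
# Pre-warm a profile's node group by requesting one instance.
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name my-deployment-profile-asg \
    --desired-capacity 1
```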
- Create a test notification
    - Navigate to your notification calendar
    - Create an event
        - Set the event to last as long as you wish the notification to display
        - The event title will appear as the notification title
        - The description includes a metadata and a message section
        - Example:

            ```
            profile: MY PROFILE, OTHER PROFILE
            type: info

            This is a notification
            ```

        - \<meta\>
            - profile: holds the name or names (comma separated) of the profiles where the notification will be displayed
            - type:
                - info: blue notification
                - success: green notification
                - warning: yellow notification
                - error: red notification
        - \<message\>
            - Your notification message
- Sign up with your `admin_user_name` account, sign in, and add groups for each profile and sudo
    - Open the `deployment_url` in a web browser
    - Click the "Sign in" button
    - Click the "Sign up" link
        - Username: the name used for the `admin_user_name` parameter of the `opensarlab.yaml`
        - Name: your name
        - Email: the email address used for the AdminEmailAddress parameter in the `deployment_name`-auth CloudFormation stack
        - Password: a password
    - Click the "Sign up" button
        - Verification Code: the verification code sent to your email address
    - Click the "Confirm Account" button
- Add a group for each profile and for sudo
    - After confirming your account, you should be redirected to the Server Options page
    - Click the "Groups" link at the top of the screen
    - Click the "Add New Group" button
        - Group Name: the group name as it appears in the helm_config.yaml group_list
            - Note that this is not the display name and it contains underscores
        - Group Description: (optional) enter a group description
        - Group Type: check "action"
            - This has no effect, but is useful for tracking user groups vs. profile groups
        - All Users?: check if you wish the profile to be accessible to all users
        - Is Enabled?: check the box
    - Click the "Add Group" button
    - Repeat for all profiles
    - Repeat for a group named "sudo"
        - Do not enable sudo for all users! It is useful for developers, but avoid giving root privileges to regular users.
- Click the "Home" link at the top of the screen
- Start up and test each profile
    - Click the "Start My Server" button
    - Select a profile
    - Click the "Start" button
    - Confirm that the profile runs as expected
        - Test notebooks as needed
        - Confirm that notifications appear
    - Repeat for each profile
- Configure your local K8s config so you can manage your EKS cluster with kubectl
    - Add your AWS user to the trust relationship of the `deployment_name`-cluster-access IAM role
        - Navigate to the AWS IAM console
        - Click the "Roles" link from the sidebar menu
        - Select the `deployment_name`-cluster-access IAM role
        - Click the "Trust relationships" tab
        - Click the "Edit trust relationship" button
        - Add your AWS user ARN
        - Example json:

            ```json
            {
                "Version": "2008-10-17",
                "Statement": [
                    {
                        "Effect": "Allow",
                        "Principal": {
                            "AWS": [
                                "arn:aws:iam::<account_id>:user/<user_name>"
                            ]
                        },
                        "Action": "sts:AssumeRole"
                    }
                ]
            }
            ```

        - Click the "Update Trust Policy" button
    - Add an AWS profile on your local machine
        - Example profile:

            ```
            [profile profile_name]
            source_profile = your_source_profile
            region = your_region
            role_arn = arn:aws:iam::<account_id>:role/<deployment_name>-cluster-user-access
            cluster_name = <deployment_name>-cluster
            ```

    - Run the helps/get_eks_kubeconfig.sh script in the opensarlab-cluster repo
        - Note: you will use this a lot, and it may be helpful to create an alias in ~/.bash_aliases
    - Use kubectl
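As a quick check that your kubeconfig works (any read-only kubectl command against the cluster will do):

```bash
# Confirm the kubeconfig points at the EKS cluster and the nodes are Ready.
kubectl get nodes
```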