# Deploy OpenSARLab to an AWS account
A note about deployments: A deployment of OpenSARLab refers to a standalone instance of OpenSARLab.
If you are setting up OpenSARLab for several classes and/or collaborative groups with disparate needs or funding sources,
it may be useful to give them each their own standalone deployment. This separates user group authentication,
simplifies billing for each group, and allows for easy cleanup at the end of a project or class (just delete the deployment).
In the following instructions, replace any occurrence of `deployment_name` with the deployment name you have chosen.
Make your deployment name lowercase and use no special characters other than dashes (-). It will be used to generate part of the Cognito callback URL, and CloudFormation stack names follow the same naming convention.
## Take AWS SES out of sandbox
The AWS Simple Email Service (SES) is used by OpenSARLab to send emails to users and administrators, including authentication-related notifications and storage lifecycle management messages.
While SES is in sandbox, you are limited to sending 1 email per second and no more than 200 in a 24-hour period, and emails may only be sent from an SES-verified address to other SES-verified addresses.
Note: Provide a detailed explanation of your SES use and email policies when applying to exit the sandbox, or you will be denied. Approval can take 24-48 hours.
- Follow these instructions to take your SES out of sandbox.
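Once approved, you can confirm your new sending limits from a terminal (a quick sanity check; assumes your AWS credentials are configured, and the region shown is an assumption):

```bash
# Check SES sending limits. Outside the sandbox, Max24HourSend should be
# well above the sandbox cap of 200 messages per 24 hours.
aws ses get-send-quota --region us-east-1
```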
## Create an AWS Cost Allocation Tag
Note: Only management accounts can create cost allocation tags.
- Create a cost allocation tag, or have one created by someone with access
- Give it an available name that makes sense for tracking deployment names associated with AWS resources
    - e.g., "deployment_name"
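If you prefer the CLI, a tag key that already appears on a resource can be activated for cost allocation like this (a sketch; the tag key `deployment_name` is the example name assumed above, and this must run in the management account):

```bash
# Activate an existing resource tag key as a cost allocation tag.
aws ce update-cost-allocation-tags-status \
    --cost-allocation-tags-status TagKey=deployment_name,Status=Active
```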
## Add dockerhub credentials to AWS Secrets Manager
This deployment uses a few publicly available docker images. Due to dockerhub rate limits (https://www.docker.com/increase-rate-limits), you will need to set up a dockerhub account; a free-tier account will suffice. CodePipeline's IP address is shared by many users, so you will likely hit the rate limit as an anonymous user (details here).
Note: By default, this secret will be used for multiple deployments. Optionally, you can edit the codebuild section in cf-cluster.yml to point to a different secret.
- If you don't have a dockerhub account, create one here
- Open the AWS Secrets Manager console
- Click the "Store a new secret" button
    - Page 1:
        - Select "Other type of secrets"
        - Select the "Plaintext" tab
        - Delete the default content
        - Add your username and password, separated by a space
            - Example: `username password`
        - Click the "Next" button
    - Page 2:
        - Secret name: `dockerhub/creds`
        - Click the "Next" button
    - Page 3:
        - Click the "Next" button
    - Page 4:
        - Click the "Store" button
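Equivalently, the secret can be created from the command line (a sketch; replace the placeholder string with your actual dockerhub username and password):

```bash
# Store dockerhub credentials as a plaintext secret named dockerhub/creds.
aws secretsmanager create-secret \
    --name dockerhub/creds \
    --secret-string "username password"
```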
## Setup an iCal calendar for notifications
Notifications are generated from iCal calendar events. ASF uses Google Calendar, but any publicly accessible iCal-formatted calendar should work as well.
- Create a public iCal-formatted calendar
- The iCal-formatted URL will be needed later
- Notification calendar events must be properly formatted
    - Formatting details are available in the "Take care of odds and ends" section
## Store your CA certificate
OpenSARLab will lack full functionality if it is not served over https (SSL certification).
- Follow these instructions to import your CA certificate into the AWS Certificate Manager
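For reference, a certificate can also be imported with the AWS CLI (a sketch; the file names are placeholders for your certificate, private key, and chain):

```bash
# Import an existing CA-signed certificate into AWS Certificate Manager.
aws acm import-certificate \
    --certificate fileb://certificate.pem \
    --private-key fileb://private_key.pem \
    --certificate-chain fileb://certificate_chain.pem
```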
## Prepare CodeCommit Repos
TODO: Do this differently

All the public OpenSARLab repos are in the ASFOpenSARlab GitHub org.
- Create a `deployment_name`-container CodeCommit repo in your AWS account
- Create a `deployment_name`-cluster CodeCommit repo
- Clone the `deployment_name`-container and `deployment_name`-cluster repos to your local computer using ssh
- cd into your local `deployment_name`-container repo
    - Add ASFOpenSARlab/opensarlab-container as a remote on your local `deployment_name`-container repo
        - `git remote add github https://github.com/ASFOpenSARlab/opensarlab-container.git`
    - Pull the remote opensarlab-container repo into your local `deployment_name`-container repo
        - `git pull github main`
    - Create a main branch in the `deployment_name`-container repo
        - `git checkout -b main`
    - Push to the remote `deployment_name`-container repo
        - `git push origin main`
- cd into your local `deployment_name`-cluster repo
    - Add ASFOpenSARlab/opensarlab-cluster as a remote on your local `deployment_name`-cluster repo
        - `git remote add github https://github.com/ASFOpenSARlab/opensarlab-cluster.git`
    - Pull the remote opensarlab-cluster repo into your local `deployment_name`-cluster repo
        - `git pull github main`
    - Create a main branch in the `deployment_name`-cluster repo
        - `git checkout -b main`
    - Push to the remote `deployment_name`-cluster repo
        - `git push origin main`

You should now have container and cluster repos in CodeCommit that are duplicates of those found in ASFOpenSARlab.
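End to end, the container half of that workflow looks like the following (a sketch; the deployment name `osl-demo` and the us-east-1 region are assumptions, your ssh key must already be registered with CodeCommit, and the cluster repo follows the same pattern):

```bash
# Create the CodeCommit repo and mirror the public GitHub code into it.
aws codecommit create-repository --repository-name osl-demo-container

git clone ssh://git-codecommit.us-east-1.amazonaws.com/v1/repos/osl-demo-container
cd osl-demo-container
git remote add github https://github.com/ASFOpenSARlab/opensarlab-container.git
git pull github main
git checkout -b main   # skip if the pull already left you on a local main branch
git push origin main
```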
## Customize opensarlab_container code for deployment
The opensarlab-container repo contains one example image named `helloworld`, which you can reference when creating new images.
Images can be used by multiple profiles.
Note: It is easiest to work in your local repo and push your changes when you're done.
- Duplicate the `images/sar` directory and rename it, using your chosen image name
    - The image name must be alpha-numeric with no whitespaces or special characters
- Edit the dockerfile
    - Adjust the packages in the 2nd apt install command to suit your image needs
    - Add any pip packages you wish installed in the base conda environment
    - Add any conda packages you wish installed in the base conda environment
    - Create any conda environments you would like pre-installed before "USER jovyan"
        - If using environment.yml files, store them in an "envs" directory in `/jupyter-hooks`, and they will be copied into the container
            - `RUN conda env create -f /etc/jupyter-hooks/envs/<image_name>_env.yml --prefix /etc/jupyter-hooks/envs/<image_name>`
    - Run any tests for this image that you added to the tests directory under `FROM release as testing`
- Remove the images/sar directory and sar.sh test script, unless you plan to use the sar image
- Add a test script for your image (see the sketch after this list)
    - Use sar.sh as an example
    - Name it `<image_name>.sh`
- Add, commit, and push changes to the remote CodeCommit repo
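For the test script itself, a hypothetical `mynewimage.sh` might just confirm that the image's pre-installed conda environment resolves (a sketch; the environment name `mynewimage_env` and the numpy import check are assumptions, not part of the repo):

```bash
#!/bin/bash
# Hypothetical smoke test for a custom image: check that the pre-installed
# conda environment exists and that a key package imports cleanly.
set -e

conda env list | grep mynewimage_env
conda run --prefix /etc/jupyter-hooks/envs/mynewimage_env python -c "import numpy"

echo "mynewimage tests passed"
```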
## Customize opensarlab_cluster code for deployment
- Create and add any additional custom Jupyter magic commands to the `opensarlab/jupyterhub/singleuser/custom_magics` directory
- Add any additional scripts you may have created for use in your image to the `opensarlab/jupyterhub/singleuser/hooks` directory
- Duplicate `opensarlab/jupyterhub/singleuser/hooks/sar.sh`, renaming it after your image name
- Edit `opensarlab/jupyterhub/singleuser/hooks/<image_name>.sh`
    - Copy any additional custom Jupyter magic scripts to `$HOME/.ipython/image_default/startup/` (alongside 00-df.py)
    - Edit the repos being pulled to suit your deployment and image needs
- Rename `opensarlab/opensarlab.example.yaml` to `opensarlab/opensarlab.yaml`
- Use the example notes in `opensarlab/opensarlab.yaml` to define the required and optional fields
- Update `opensarlab/jupyterhub/helm_config.yaml`
    - singleuser
        - Add any needed extraFiles
    - hub
        - Add any needed extraFiles
- Add, commit, and push changes to the remote CodeCommit repo
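The rename-and-push steps above, as shell commands (a sketch; the commit message is illustrative):

```bash
# Rename the example config, then publish the customizations.
git mv opensarlab/opensarlab.example.yaml opensarlab/opensarlab.yaml
# ...edit opensarlab.yaml and helm_config.yaml as described above...
git add -A
git commit -m "Customize cluster config for deployment"
git push origin main
```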
## Build the container CloudFormation stack
This will create the hub image and the images for each profile, and store them in namespaced ECR repos.
- Open CloudFormation in the AWS console
- Click the "Create stack" button and select "With new resources (standard)"
    - Page 1: Create stack
        - Under "Specify template", check "Upload a template file"
        - Use the file chooser to select cf-container.py from your local branch of the `deployment_name`-container repo
        - Click the "Next" button
    - Page 2: Specify stack details
        - Stack Name: use a recognizable name that makes sense for your deployment
        - CodeCommitSourceRepo: the CodeCommit repo holding the container code (`deployment_name`-container)
        - CodeCommitSourceBranch: the name of the production branch of the `deployment_name`-container CodeCommit repo
        - CostTagKey: the cost allocation key you registered for tracking deployment costs
        - CostTagValue: `deployment_name`
    - Page 3: Configure stack options
        - Tags:
            - Key: the cost allocation tag
            - Value: `deployment_name`
        - Click the "Next" button
    - Page 4: Review
        - Review and confirm correctness
        - Check the box next to "I acknowledge that AWS CloudFormation might create IAM resources"
        - Click the "Create Stack" button
- Monitor the stack build for errors and rollbacks
    - The screen does not self-update; use the refresh buttons
    - If the build fails and rolls back, go to the CloudFormation stacks page, then select and delete the failed stack before correcting any errors and trying again
## Build the cluster CloudFormation stack
This CloudFormation stack dynamically creates 3 additional stacks.
- Open CloudFormation in the AWS console
- Click the "Create stack" button and select "With new resources (standard)"
    - Page 1: Create stack
        - Under "Specify template", check "Upload a template file"
        - Use the file chooser to select `opensarlab/pipeline/cf-pipeline.yaml` from your local branch of the cluster repo
        - Click the "Next" button
    - Page 2: Specify stack details
        - Stack Name: use a recognizable name that makes sense for your deployment. Do not use a stack name that ends in `cluster`, `jupyterhub`, or `cognito`; these are reserved.
        - CodeCommitRepoName: the CodeCommit repo holding the cluster code (`deployment_name`-cluster)
        - CodeCommitBranchName: the name of the production branch of the `deployment_name`-cluster CodeCommit repo
        - CostTagKey: the cost allocation key you registered for tracking deployment costs
        - CostTagValue: `deployment_name`
    - Page 3: Configure stack options
        - Tags:
            - Key: the cost allocation tag
            - Value: `deployment_name`
        - Click the "Next" button
    - Page 4: Review
        - Review and confirm correctness
        - Check the box next to "I acknowledge that AWS CloudFormation might create IAM resources"
        - Click the "Create Stack" button
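Once the pipeline stack completes, you can confirm that the dynamically created child stacks came up (a sketch; stack names will vary with the name you chose):

```bash
# List stacks that are still creating or have finished creating.
aws cloudformation list-stacks \
    --stack-status-filter CREATE_IN_PROGRESS CREATE_COMPLETE \
    --query 'StackSummaries[].[StackName,StackStatus]' \
    --output table
```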
## Take care of odds and ends
- Update `deployment_url` in the cluster repo's `opensarlab/opensarlab.yaml` if you started off using a load balancer
    - Don't forget to update your DNS record
- Add the cost allocation tag to the EKS cluster
    - Navigate to the AWS EKS console
    - Click the "Clusters" link in the sidebar menu
    - Click on the cluster stack
    - Click the "Tags" tab
    - Click the "Manage tags" button
    - Click the "Add tag" button
        - Key: the cost allocation tag
        - Value: `deployment_name`
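The same tag can also be applied with the AWS CLI (a sketch; the region, account ID, cluster name, and tag key/value are all placeholders):

```bash
# Tag the EKS cluster so its costs roll up under the deployment tag.
aws eks tag-resource \
    --resource-arn arn:aws:eks:us-east-1:123456789012:cluster/my-deployment-cluster \
    --tags deployment_name=my-deployment
```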
- Prime the Auto Scaling group for each profile, unless there are active users
    - Navigate to the AWS EC2 console
    - Select the "Auto Scaling Groups" sidebar link
    - Select an autoscaling group
    - Group details:
        - Click the "Edit" button
        - Desired capacity: set to 1
        - Click the "Update" button
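Or, per Auto Scaling group, from the CLI (a sketch; the group name is a placeholder):

```bash
# Pre-warm a profile's node group by requesting one instance.
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name my-deployment-profile-asg \
    --desired-capacity 1
```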
- Create a test notification
    - Navigate to your notification calendar
    - Create an event
        - Set the event to last as long as you wish the notification to display
        - The event title will appear as the notification title
        - The description includes a metadata and a message section
        - Example:

            ```
            profile: MY PROFILE, OTHER PROFILE
            type: info

            This is a notification
            ```

        - \<meta\>
            - profile: holds the name or names (comma separated) of the profiles where the notification will be displayed
            - type:
                - info: blue notification
                - success: green notification
                - warning: yellow notification
                - error: red notification
        - \<message\>
            - Your notification message
- Sign up with your `admin_user_name` account, sign in, and add groups for each profile and sudo
    - Open the `deployment_url` in a web browser
    - Click the "Sign in" button
    - Click the "Sign up" link
        - Username: the name used for the `admin_user_name` parameter of the `opensarlab.yaml`
        - Name: your name
        - Email: the email address used for the AdminEmailAddress parameter in the `deployment_name`-auth CloudFormation stack
        - Password: a password
    - Click the "Sign up" button
        - Verification Code: the verification code sent to your email address
    - Click the "Confirm Account" button
- Add a group for each profile and for sudo
    - After confirming your account, you should be redirected to the Server Options page
    - Click the "Groups" link at the top of the screen
    - Click the "Add New Group" button
        - Group Name: the group name as it appears in the helm_config.yaml group_list
            - Note that this is not the display name and it contains underscores
        - Group Description: (optional) enter a group description
        - Group Type: check "action"
            - This has no effect, but is useful for tracking user groups vs. profile groups
        - All Users?: check if you wish the profile to be accessible to all users
        - Is Enabled?: check the box
    - Click the "Add Group" button
    - Repeat for all profiles
    - Repeat for a group named "sudo"
        - Do not enable sudo for all users! It is useful for developers, but avoid giving root privileges to regular users.
- Click the "Home" link at the top of the screen
- Start up and test each profile
    - Click the "Start My Server" button
    - Select a profile
    - Click the "Start" button
    - Confirm that the profile runs as expected
        - Test notebooks as needed
        - Confirm that notifications appear
    - Repeat for each profile
- Configure your local K8s config so you can manage your EKS cluster with kubectl
    - Add your AWS user to the trust relationship of the `deployment_name`-cluster-access IAM role
        - Navigate to the AWS IAM console
        - Click the "Roles" link from the sidebar menu
        - Select the `deployment_name`-cluster-access IAM role
        - Click the "Trust relationships" tab
        - Click the "Edit trust relationship" button
        - Add your AWS user ARN
        - Example json:

            ```json
            {
                "Version": "2008-10-17",
                "Statement": [
                    {
                        "Effect": "Allow",
                        "Principal": {
                            "AWS": [
                                "arn:aws:iam::<account_id>:user/<user_name>"
                            ]
                        },
                        "Action": "sts:AssumeRole"
                    }
                ]
            }
            ```

        - Click the "Update Trust Policy" button
    - Add an AWS profile on your local machine
        - Example profile:

            ```
            [profile profile_name]
            source_profile = your_source_profile
            region = your_region
            role_arn = arn:aws:iam::<account_id>:role/<deployment_name>-cluster-user-access
            cluster_name = <deployment_name>-cluster
            ```

    - Run the helps/get_eks_kubeconfig.sh script in the opensarlab-cluster repo
        - Note: you will use this a lot, and it may be helpful to create an alias in ~/.bash_aliases
    - Use kubectl
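As a quick check that your kubeconfig works (any read-only kubectl command against the cluster will do):

```bash
# Confirm the kubeconfig points at the EKS cluster and the nodes are Ready.
kubectl get nodes
```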