How to create a JupyterHub (JH) on AWS

The environment variable $KUBECONFIG points to the file that kubectl reads to access clusters; it contains cluster connection info, credentials (keys and certs), user info, etc. By default, tools look in ~/.kube/config, but we've set it to /datascience/keys/kube/config in your .bashrc file, which contains the following:

export PATH=$PATH:$HOME/bin

# for some small local installs/scripts
export PATH="/datascience/local/bin:$PATH"
# point kubectl at this config file (which stores cluster information and credentials; default is ~/.kube/config)
export KUBECONFIG="/datascience/keys/kube/config"
# k is so much easier to type...
alias k=kubectl
alias kx=kubectx
# AWS profile
export AWS_PROFILE=default
# for krew, a kubectl plugin (installed via website directions https://krew.sigs.k8s.io/docs/user-guide/setup/install/)
export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"

kubectl is aliased to k. k config is the command for working with clusters - or rather “contexts”, which store the current cluster, user, and namespace being looked at. k config get-contexts lists the known contexts (frustratingly, this one is different from most k commands, where get is a separate word); the second column (NAME) shows each context's name.

You can switch to a context with k config use-context <context-name>; after that, everything shown will be in that particular cluster. Since we're using a shared config setup, it's worth noting which context is active: if one of us switches it, the other sees the change as well.
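
For example (hub-green here is just an illustrative context name; substitute whatever k config get-contexts actually reports):

# the row marked with '*' is the active context
k config get-contexts
# switch contexts; with the shared config, this changes it for everyone
k config use-context hub-green
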
To create a new cluster on AWS we are using the eksctl utility, which is an AWS-specific tool for spinning up and managing EKS clusters. It can do pretty nifty stuff, such as replacing a nodegroup while the cluster is running (if you want to update its settings or instance sizes, for example).

eksctl uses YAML config files to define clusters; the ones currently running are in /datascience/dsosuk8s/cluster/eksctl. The README.md file in that directory is pretty complete (markdown render on github: https://github.com/datasci-osu/dsosuk8s/blob/master/cluster/eksctl/README.md). So, to create a new cluster:

  1. Copy the latest-and-greatest (currently hub-green.yaml) to a new file (dev-green.yaml, for example)
  2. Adjust the cluster “name”, and the publicKeyPaths should now be /datascience/keys/sshkeys/eksctl_id_rsa.pub (there are several instances in the file, one per nodegroup - these are the keys used to ssh to nodes if necessary)
  3. Create the cluster with eksctl create cluster -f <cluster-name>.yaml (see the example below)
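
Concretely, assuming the new cluster is called dev-green:

cd /datascience/dsosuk8s/cluster/eksctl
cp hub-green.yaml dev-green.yaml
# edit the cluster name and the publicKeyPaths entries in dev-green.yaml, then:
eksctl create cluster -f dev-green.yaml
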

Spinning up a new cluster on AWS takes a long time, about 20-30 minutes. Once it is ready you can rename the context, since the default name is something long and obnoxious: k config get-contexts will show the contexts, including the new one; then run k config rename-context <current-name> <new-name> to rename it to something reasonable.
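
For example (the long autogenerated name below is made up; use whatever get-contexts actually reports):

k config get-contexts
k config rename-context someuser@dev-green.us-west-2.eksctl.io dev-green
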

Configure the cluster

Next up is getting the cluster tooling in place: reverse proxy (“ingress”), autoscaling client, etc. You can copy the /datascience/deployments/hub-green.datasci.oregonstate.edu folder to /datascience/deployments/<new-name>.datasci.oregonstate.edu. This naming convention isn't anything special.
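
For example, sticking with dev-green as the hypothetical new cluster:

cp -r /datascience/deployments/hub-green.datasci.oregonstate.edu /datascience/deployments/dev-green.datasci.oregonstate.edu
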

You'll need to edit the 01- through 08-*.yaml files to reference the new kubeContext (the same name you set with k config rename-context) and masterHost (the eventual CNAME), then run them in order, letting each finish before starting the next; note that the adminPassword: in 06-grafana.yaml defaults to admin.
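
Something like this works, assuming the dev-green example and that the numbered files follow the NN-name.yaml pattern (the files are executable, as described further down):

cd /datascience/deployments/dev-green.datasci.oregonstate.edu
# run 01 through 08 in order, stopping if one fails
for f in 0[1-8]-*.yaml; do ./"$f" || break; done
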

When it's up and going, you can find out the hostname to point the new CNAME at with:

k get svc master-ingress-nginx-ingress -n cluster-tools

or start with:

k ns cluster-tools

to select that namespace, then do a k get all to see the various pieces living there, including the load balancer. Assuming that all worked out, the usage dashboard should live at https://<cluster-name>.datasci.oregonstate.edu/grafana/ (login admin / admin by default).
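
If you just want the load balancer hostname by itself, a standard jsonpath query against that service should also work (untested here, but this is plain kubectl):

k get svc master-ingress-nginx-ingress -n cluster-tools -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
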

There is a work-in-progress dashboard for the hub; to install it, click the “+” icon in grafana, select “import”, and paste the JSON from /datascience/dsosuk8s/cluster/grafana_prometheus/jhub_cluster_metrics.json, and it should create it. It's also possible to put in the numeric id of a dashboard from https://grafana.com/grafana/dashboards if you want to play around. (Grafana is a GUI for displaying metrics served up by, amongst other databases, prometheus, a time-series metrics database server.)

That should get you to the point of creating a hub, based on one of the examples (e.g. cj-test.yaml - don't forget to change the kubeContext and clusterHostname variables to match the new cluster). The yaml files are executable with /usr/bin/env -S; the first line in 03-registry.yaml, for example, is

#!/usr/bin/env -S helm kush upgrade registry /datascience/dsosuk8s/charts/docker-registry --install --kush-interpolate --values

where the contents of the file itself are taken as the parameter for that last --values.
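
So running ./03-registry.yaml directly is (roughly) equivalent to invoking helm kush by hand with the file as the final values argument:

helm kush upgrade registry /datascience/dsosuk8s/charts/docker-registry --install --kush-interpolate --values 03-registry.yaml
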

Instead of listing kubeContext in this and various other files, we should be able to have a file like common.yaml with contents like:

kubeContext: dev-green
clusterHostname: dev-green.datasci.oregonstate.edu

And then replace the #! line with

#!/usr/bin/env -S helm kush upgrade registry /datascience/dsosuk8s/charts/docker-registry --install --kush-interpolate --values common.yaml --values

If you're feeling brave, give it a try and see how it works - worst case you delete the cluster with eksctl delete cluster --name <cluster-name> and start over ;)

Adding user access to the cluster

By default, whoever creates the cluster is the only one with access to it. To give someone else access, you'll need to do the following.

k edit configmap -n kube-system aws-auth

This will pop up an editor where you can edit some of the definitions in the cluster. Add (or extend) the mapUsers section like this:

mapUsers: |
  - userarn: arn:aws:iam::395763313923:user/name
    username: name
    groups:
      - system:masters

Spacing is important. You can get the userarn by running aws sts get-caller-identity
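
For example, to print just the ARN (run as, or with the profile of, the user being added; --query and --output are standard AWS CLI flags):

aws sts get-caller-identity --query Arn --output text
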

Changes to EKS 1.23

New in EKS version 1.23, you now have to add the Amazon EBS CSI driver as an Amazon EKS add-on to the cluster.
Below are the steps to run after running the eksctl create cluster command above.
First, create the Amazon EBS CSI driver IAM role for service accounts. When the plugin is deployed, it creates and is configured to use a service account named ebs-csi-controller-sa, which is bound to a Kubernetes clusterrole that's assigned the required Kubernetes permissions. Before creating the IAM role, you first need to enable the OIDC provider:

eksctl utils associate-iam-oidc-provider --region=us-west-2 --cluster=dev-yellow --approve
eksctl create iamserviceaccount --name ebs-csi-controller-sa --namespace kube-system --cluster NAME_OF_CLUSTER --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy --approve --role-only --role-name AmazonEKS_EBS_CSI_DriverRole

Then we can add the EBS CSI driver add-on.
NOTE: To get the ARN for the role created above, log in to the AWS console and go to the CloudFormation console. In the list of stacks, find the one named “eksctl-CLUSTER_NAME-addon-iamserviceaccount-kube-system-ebs-csi-controller-sa”. Click on the linked name and then go to the “Resources” tab. This should list one Role, AmazonEKS_EBS_CSI_DriverRole. Click on that name link and it will bring up a new page with the ARN to use in the driver add-on command below.
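
Alternatively, the role ARN can usually be pulled straight from the CLI (assuming the role name used above):

aws iam get-role --role-name AmazonEKS_EBS_CSI_DriverRole --query Role.Arn --output text
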

eksctl create addon --name aws-ebs-csi-driver --cluster NAME_OF_CLUSTER --service-account-role-arn arn:aws:iam::395703310923:role/AmazonEKS_EBS_CSI_DriverRole --force
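
To double-check that the add-on ended up on the cluster (NAME_OF_CLUSTER is a placeholder, as above):

eksctl get addon --name aws-ebs-csi-driver --cluster NAME_OF_CLUSTER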