How to create JH on AWS
The environment variable $KUBECONFIG points to the file that kubectl reads to access clusters; it contains cluster endpoints, credentials (keys/certificates), user info, etc. By default, tools look in ~/.kube/config, but we've set it to /datascience/keys/kube/config in your .bashrc file, which contains the following:
export PATH=$PATH:$HOME/bin
# for some small local installs/scripts
export PATH="/datascience/local/bin:$PATH"
# point kubectl at this config file (which stores cluster information and credentials; default is ~/.kube/config)
export KUBECONFIG="/datascience/keys/kube/config"
# k is so much easier to type...
alias k=kubectl
alias kx=kubectx
# AWS profile
export AWS_PROFILE=default
# for krew, a kubectl plugin (installed via website directions https://krew.sigs.k8s.io/docs/user-guide/setup/install/)
export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"
kubectl is aliased to k. k config is a command for working with clusters, or rather “contexts”, which store the current cluster, user, and namespace being looked at. k config get-contexts lists the current contexts (this command is frustratingly different from most k commands, where get is a separate word); the second column shows the context 'name'.
You can switch to a context with k config use-context <context-name>; after that, everything kubectl shows refers to that particular cluster. Since we're using a shared config setup, it's worth keeping track of which context is active: if one of us switches it, the other sees the change as well.
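For example, to list the contexts and switch (dev-green here is just a placeholder name):

k config get-contexts
k config use-context dev-green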
To create a new cluster on AWS we use the eksctl utility, an AWS-specific tool for spinning up and managing EKS clusters. It can do pretty nifty stuff, such as replacing a nodegroup while the cluster is running (for example, to change its settings, like the instance sizes).
eksctl uses YAML config files to define clusters; the ones currently running are in /datascience/dsosuk8s/cluster/eksctl. The README.md file in that directory is pretty complete and renders as markdown on github:
https://github.com/datasci-osu/dsosuk8s/blob/master/cluster/eksctl/README.md
so, to create a new cluster:
- Copy the latest-and-greatest (currently hub-green.yaml) to another one (dev-green.yaml, for example)
- Adjust the cluster “name”; the publicKeyPaths should now be /datascience/keys/sshkeys/eksctl_id_rsa.pub (there are several instances in the file, one for each nodegroup; these are the keys used to ssh to nodes if necessary)
- Create the cluster with eksctl create cluster -f cluster-name.yaml (a quick sketch of the whole sequence follows)
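A minimal sketch of that sequence (dev-green.yaml is just an example name):

cd /datascience/dsosuk8s/cluster/eksctl
cp hub-green.yaml dev-green.yaml
# edit dev-green.yaml: change the cluster name and the publicKeyPath entries
eksctl create cluster -f dev-green.yaml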
Spinning up a new cluster on AWS takes a long time, about 20-30 minutes. Once it is ready you can rename the context, since the default name is something long and obnoxious: k config get-contexts will show the contexts, including the new one, and k config rename-context <current-name> <new-name> renames it to something reasonable.
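For example (the long default name below is made up; copy the real one from get-contexts):

k config get-contexts
k config rename-context someuser@dev-green.us-west-2.eksctl.io dev-green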
Configure the cluster
Next up is getting the cluster tooling in place: reverse proxy (“ingress”), autoscaler, etc. You can copy the /datascience/deployments/hub-green.datasci.oregonstate.edu folder to /datascience/deployments/<new-name>.datasci.oregonstate.edu (the naming convention isn't anything special).
You'll need to edit 01-08*.yaml to reference the new kubeContext (same name as set with k config rename-context) and masterHost (the eventual CNAME). Run them in order and let each finish; note that adminPassword: in 06-grafana.yaml defaults to admin.
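A sketch of that loop, assuming the files are directly executable and named 01-*.yaml through 08-*.yaml (adjust the glob if the folder differs):

cd /datascience/deployments/dev-green.datasci.oregonstate.edu
# edit kubeContext and masterHost in each file first, then run them in order:
for f in 0[1-8]-*.yaml; do ./"$f"; done   # runs each one, waiting for it to finish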
When it's up and going, you can find out the host to point the new CNAME with:
k get svc master-ingress-nginx-ingress -n cluster-tools
or start with:
k ns cluster-tools
to select that namespace, then do a k get all to see the various pieces living there, including the loadbalancer. Assuming that all worked out, the usage dashboard should live at https://cluster-name.datasci.oregonstate.edu/grafana/ (login admin / admin by default).
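If you just want the loadbalancer hostname to point the CNAME at, a jsonpath query along these lines should print it directly (a sketch, assuming the service shown above):

k get svc master-ingress-nginx-ingress -n cluster-tools \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'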
There is a work-in-progress dashboard for the hub; to install it click the “+”
icon in grafana and select “import”, paste the JSON from
/datascience/dsosuk8s/cluster/grafana_prometheus/jhub_cluster_metrics.json and
it should create it.
it's also possible to put in the numeric id of a dashboard from https://grafana.com/grafana/dashboards if you want to play around; grafana is a GUI for displaying metrics served up by (amongst other databases) prometheus, a time-series metrics database server.
That should get you to the point of creating a hub, based on one of the examples (e.g. cj-test.yaml); don't forget to change the kubeContext and clusterHostname variables to match the new cluster. The yaml files are executable with /usr/bin/env -S; the first line in 03-registry.yaml, for example, is
#!/usr/bin/env -S helm kush upgrade registry /datascience/dsosuk8s/charts/docker-registry --install --kush-interpolate --values
where the contents of the file are taken as the parameter for the last --values.
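In other words, since env -S appends the script's path as the final argument, running ./03-registry.yaml should be roughly equivalent to:

helm kush upgrade registry /datascience/dsosuk8s/charts/docker-registry \
  --install --kush-interpolate --values ./03-registry.yaml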
Instead of listing kubeContext in this and various other files, we should be
able to have a file like common.yaml with contents like
kubeContext: dev-green
clusterHostname: dev-green.datasci.oregonstate.edu
And then replace the #! line with
#!/usr/bin/env -S helm kush upgrade registry /datascience/dsosuk8s/charts/docker-registry --install --kush-interpolate --values common.yaml --values
If you're feeling brave, give it a try and see how it works; worst case you delete the cluster with eksctl delete cluster --name <cluster-name> and start over ;)
Adding user access to the cluster
By default, the user that creates the cluster will be the only one with access to it. To give someone else access, you'll need to do the following:
k edit configmap -n kube-system aws-auth
This will pop up an editor where you can edit some of the definitions in the cluster. Add (or extend) the mapUsers section like this:
mapUsers: |
  - userarn: arn:aws:iam::395763313923:user/name
    username: name
    groups:
      - system:masters
Spacing is important. You can get the userarn by running aws sts get-caller-identity.
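For example, to print just your ARN:

aws sts get-caller-identity --query Arn --output text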
Changes to EKS 1.23
Starting with version 1.23, you have to add the Amazon EBS CSI driver as an Amazon EKS add-on to the EKS cluster.
Below are the steps to follow after running the eksctl create cluster command above.
First you need to create the Amazon EBS CSI driver IAM role for service accounts. When the plugin is deployed, it creates and is configured to use a service account named ebs-csi-controller-sa, which is bound to a Kubernetes ClusterRole with the required permissions. Before creating the IAM role, you first need to enable the OIDC provider:
eksctl utils associate-iam-oidc-provider --region=us-west-2 --cluster=dev-yellow --approve

eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster NAME_OF_CLUSTER \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve \
  --role-only \
  --role-name AmazonEKS_EBS_CSI_DriverRole
Then we can add on the EBS CSI driver.
NOTE: To get the ARN for the role created above, log in to the AWS console and go to the CloudFormation console. In the list of stacks, find the one named “eksctl-CLUSTER_NAME-addon-iamserviceaccount-kube-system-ebs-csi-controller-sa”. Click the linked name, then go to the “Resources” tab; it should list one Role, AmazonEKS_EBS_CSI_DriverRole. Click that name link and it will bring up a new page with the ARN to use in the driver add-on below.
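If you'd rather skip the console, something like this should print the same ARN (assuming the role name created above):

aws iam get-role --role-name AmazonEKS_EBS_CSI_DriverRole --query Role.Arn --output text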
eksctl create addon --name aws-ebs-csi-driver --cluster NAME_OF_CLUSTER --service-account-role-arn arn:aws:iam::395703310923:role/AmazonEKS_EBS_CSI_DriverRole --force