January 19th, 2024
HDFS docker-compose
https://faun.pub/run-your-first-big-data-project-using-hadoop-and-docker-in-less-than-10-minutes-e1bbe2974ef3
# kubernetes.txt
https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/
install-kubeadm/
Installing a container runtime
Install Docker Engine on Ubuntu
=============
1.Set up Docker's apt repository.
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
# Add the repository to Apt sources:
echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
2. Install the Docker packages.
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
3. Verify that the Docker Engine installation is successful by running the hello-world image.
sudo docker run hello-world
4. Install cri-dockerd for Docker
---------
4.1 Install Go
4.1.1 Download tarball
wget https://go.dev/dl/go1.21.3.linux-amd64.tar.gz
4.1.2 Extract the tarball (writing to /usr/local needs root)
sudo tar -C /usr/local -xzf go1.21.3.linux-amd64.tar.gz
4.1.3 Add the Go binaries to PATH
echo 'export PATH=$PATH:/usr/local/go/bin' >>~/.profile
source ~/.profile
4.2 Build and install cri-dockerd (on a Linux system that uses systemd and already has Docker Engine installed)
# Clone cri-dockerd
git clone https://github.com/Mirantis/cri-dockerd.git
cd cri-dockerd
# Build as a regular (non-root) user
make cri-dockerd
# Run these commands as root, from inside the cri-dockerd directory
mkdir -p /usr/local/bin
install -o root -g root -m 0755 cri-dockerd /usr/local/bin/cri-dockerd
install packaging/systemd/* /etc/systemd/system
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
systemctl daemon-reload
systemctl enable --now cri-docker.socket
Installing kubeadm, kubelet and kubectl
=============
1. Update the apt package index and install packages needed to use the Kubernetes apt repository:
sudo apt-get update
# apt-transport-https may be a dummy package; if so, you can skip that package
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
2. Download the public signing key for the Kubernetes package repositories.
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
3. Add the appropriate Kubernetes apt repository
# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
4. Update the apt package index, install kubelet, kubeadm and kubectl, and pin their version:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
Creating a cluster with kubeadm
===============
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///var/run/cri-dockerd.sock
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
sudo kubeadm join 10.131.187.176:6443 --token knefgg.pdafluamsim49olo --cri-socket=unix:///var/run/cri-dockerd.sock \
--discovery-token-ca-cert-hash sha256:b058fc69cbec62d085bb38d84f0a89879cbe16068567f061a8fac84f87eab9aa
-----
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5dd5756b68-86d94 0/1 Pending 0 5m52s
kube-system coredns-5dd5756b68-tq4v5 0/1 Pending 0 5m52s
kube-system etcd-k8smaster-vm 1/1 Running 0 6m5s
kube-system kube-apiserver-k8smaster-vm 1/1 Running 0 6m8s
kube-system kube-controller-manager-k8smaster-vm 1/1 Running 0 6m5s
kube-system kube-proxy-kl7d8 1/1 Running 0 5m52s
kube-system kube-scheduler-k8smaster-vm 1/1 Running 0 6m5s
# Install flannel
https://github.com/flannel-io/flannel#deploying-flannel-manually
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
# Check pods after installing flannel
kubectl get pods -A --watch
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-7m5rq 1/1 Running 0 54s
kube-system coredns-5dd5756b68-86d94 1/1 Running 0 7m21s
kube-system coredns-5dd5756b68-tq4v5 1/1 Running 0 7m21s
kube-system etcd-k8smaster-vm 1/1 Running 0 7m34s
kube-system kube-apiserver-k8smaster-vm 1/1 Running 0 7m37s
kube-system kube-controller-manager-k8smaster-vm 1/1 Running 0 7m34s
kube-system kube-proxy-kl7d8 1/1 Running 0 7m21s
kube-system kube-scheduler-k8smaster-vm 1/1 Running 0 7m34s
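
The two listings differ only in the coredns pods, which stay Pending until a pod network (flannel here) is installed. A minimal sketch for picking out pods that are not yet Running, assuming the column layout of `kubectl get pods -A` shown above. The helper name `not_running` and the `sample` variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods -A` output on stdin and prints
# namespace/name for every pod whose STATUS (4th column) is not "Running".
not_running() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 }'
}

# Sample input copied from the first listing (before flannel was installed).
sample='NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5dd5756b68-86d94 0/1 Pending 0 5m52s
kube-system coredns-5dd5756b68-tq4v5 0/1 Pending 0 5m52s
kube-system etcd-k8smaster-vm 1/1 Running 0 6m5s'

printf '%s\n' "$sample" | not_running
```

On a live cluster the same filter would be used as `kubectl get pods -A | not_running`; empty output means every pod reports Running.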
# Copy file to the hadoop container
docker cp kubernetes.txt namenode:/tmp/
# Get inside the hadoop container
docker exec -it namenode /bin/bash
# 1. Create the root directory for this project:
hadoop fs -mkdir /tmp
# 2. Create the directory for the input files:
hadoop fs -mkdir /tmp/Input
# 3. Copy the input file to HDFS:
hadoop fs -put /tmp/kubernetes.txt /tmp/Input
# You can open the HDFS web UI at:
http://localhost:9870
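
Before moving on to spark-shell, the upload can also be checked from the command line. A small sketch, assuming it runs inside the namenode container and uses the same paths as the steps above. The function name `verify_upload` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the input directory and print the first lines of
# the uploaded file. Paths are the ones used in the steps above.
verify_upload() {
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop CLI not found; run this inside the namenode container" >&2
    return 1
  fi
  hadoop fs -ls /tmp/Input &&
    hadoop fs -cat /tmp/Input/kubernetes.txt | head -n 5
}

verify_upload || echo "upload not verified"
```

If the listing shows kubernetes.txt and the first lines match the local file, the input is in place.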
# spark-shell
scala> val text = sc.textFile("hdfs://localhost:9000/tmp/Input/kubernetes.txt")
text: org.apache.spark.rdd.RDD[String] = hdfs://localhost:9000/tmp/Input/kubernetes.txt MapPartitionsRDD[3] at textFile at <console>:23
scala> text.collect;
res1: Array[String] = Array(https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s, https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/, install-kubeadm/, "", Installing a container runtime, Install Docker Engine on Ubuntu, =============, 1.Set up Docker's apt repository., "", # Add Docker's official GPG key:, sudo apt-get update, sudo apt-get install ca-certificates curl gnupg, sudo install -m 0755 -d /etc/apt/keyrings, curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg, sudo chmod a+r /etc/apt/keyrings/docker.gpg, "", # Add the repository to Apt sources:, echo \, " "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu ...
scala> val counts = text.flatMap(line => line.split(" "))
counts: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[4] at flatMap at <console>:23
scala> counts.collect;
res2: Array[String] = Array(https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s, https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/, install-kubeadm/, "", Installing, a, container, runtime, Install, Docker, Engine, on, Ubuntu, =============, 1.Set, up, Docker's, apt, repository., "", #, Add, Docker's, official, GPG, key:, sudo, apt-get, update, sudo, apt-get, install, ca-certificates, curl, gnupg, sudo, install, -m, 0755, -d, /etc/apt/keyrings, curl, -fsSL, https://download.docker.com/linux/ubuntu/gpg, |, sudo, gpg, --dearmor, -o, /etc/apt/keyrings/docker.gpg, sudo, chmod, a+r, /etc/apt/keyrings/docker.gpg, "", #, Add, the, repository, to, Apt, sources:, echo, \, "", "", "deb, [arch="$(dpkg, --print-architecture)", signed-by=/etc/apt/keyrings...
scala> val mapf = counts.map(word => (word,1))
mapf: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[5] at map at <console>:23
scala> mapf.collect
res3: Array[(String, Int)] = Array((https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s,1), (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/,1), (install-kubeadm/,1), ("",1), (Installing,1), (a,1), (container,1), (runtime,1), (Install,1), (Docker,1), (Engine,1), (on,1), (Ubuntu,1), (=============,1), (1.Set,1), (up,1), (Docker's,1), (apt,1), (repository.,1), ("",1), (#,1), (Add,1), (Docker's,1), (official,1), (GPG,1), (key:,1), (sudo,1), (apt-get,1), (update,1), (sudo,1), (apt-get,1), (install,1), (ca-certificates,1), (curl,1), (gnupg,1), (sudo,1), (install,1), (-m,1), (0755,1), (-d,1), (/etc/apt/keyrings,1), (curl,1), (-fsSL,1), (https://download.docker.com/linux/ubuntu/gpg,1), (|,1), (sudo,1), (gpg,1), (--dearmor,1), (-o,1), (/etc/apt/ke...
scala> val reducef = mapf.reduceByKey(_+_);
reducef: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[6] at reduceByKey at <console>:23
scala> reducef.collect
res4: Array[(String, Int)] = Array((package,4), (index,1), (cluster.,1), (kube-scheduler-k8smaster-vm,2), ("$(.,1), (-e,1), (/',1), (/etc/kubernetes/admin.conf,1), (/etc/os-release,1), (This,1), (repository.,1), ([signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg],1), (RESTARTS,2), (kube-flannel,1), (kube-apiserver-k8smaster-vm,2), (daemon-reload,1), (export,2), (gpg,3), (already,1), (any,2), (go,1), (make,1), (network,1), (Download,2), (git,1), (control-plane,1), (4.,2), (packaging/systemd/*,1), (-o,3), (are,1), ("kubectl,1), (2.,2), (sha256:b058fc69cbec62d085bb38d84f0a89879cbe16068567f061a8fac84f87eab9aa,1), ([podnetwork].yaml",1), (https://download.docker.com/linux/ubuntu/gpg,1), (STATUS,2), (kubelet,3), (overwrites,1), (commands,1), (can,3), (tee,2), (...
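
The three Spark steps above (flatMap to split each line on spaces, map to (word, 1), reduceByKey to sum) can be cross-checked on a local copy of the file with a plain shell pipeline. This is only a rough sketch: the name `word_count` is made up, and unlike the Spark version it drops empty tokens, so the count for "" will not appear.

```shell
#!/bin/sh
# Hypothetical helper: word frequencies for text on stdin.
# tr splits on single spaces (like line.split(" ")), awk counts each
# non-empty token, and sort orders the result by word.
word_count() {
  tr ' ' '\n' |
    awk 'length($0) > 0 { n[$0]++ } END { for (w in n) print w, n[w] }' |
    LC_ALL=C sort
}

# Two lines taken from kubernetes.txt as a tiny sample.
printf '%s\n' 'sudo apt-get update' 'sudo apt-get install curl' | word_count
# prints:
# apt-get 2
# curl 1
# install 1
# sudo 2
# update 1
```

For the full file: `word_count < kubernetes.txt`. The numbers should line up with the pairs in reducef.collect, apart from the empty-string entry.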