January 19th, 2024
HDFS docker-compose

https://faun.pub/run-your-first-big-data-project-using-hadoop-and-docker-in-less-than-10-minutes-e1bbe2974ef3

				
# kubernetes.txt
https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/
install-kubeadm/

Installing a container runtime
Install Docker Engine on Ubuntu
=============
1.Set up Docker's apt repository.

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add the repository to Apt sources:
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
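
The nested quoting in the echo command above is easy to get wrong when retyping it. A minimal sketch that builds an equivalent sources line with printf; the helper name docker_repo_line is hypothetical and used only for illustration:

```shell
# docker_repo_line is a hypothetical helper name used only for this sketch.
# $1: dpkg architecture (e.g. the output of `dpkg --print-architecture`)
# $2: Ubuntu codename (VERSION_CODENAME from /etc/os-release)
docker_repo_line() {
  printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu %s stable\n' "$1" "$2"
}

# Usage on a real Ubuntu host (not executed here):
#   docker_repo_line "$(dpkg --print-architecture)" \
#     "$(. /etc/os-release && echo "$VERSION_CODENAME")" |
#     sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
```

Printing the line first makes it possible to inspect it before it is written to /etc/apt/sources.list.d/docker.list.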

2. Install the Docker packages.

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

3. Verify that the Docker Engine installation is successful by running the hello-world image.

sudo docker run hello-world

4. Install cri-dockerd for Docker
---------
4.1 Install Go
4.1.1 Download tarball 
wget https://go.dev/dl/go1.21.3.linux-amd64.tar.gz

4.1.2 Extract the tarball into /usr/local (requires root)
sudo tar -C /usr/local -xzf go1.21.3.linux-amd64.tar.gz

4.1.3 export go path
echo 'export PATH=$PATH:/usr/local/go/bin' >>~/.profile
source ~/.profile 
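
Running the echo line above more than once appends a duplicate entry to ~/.profile each time. A minimal sketch of a guard, assuming a POSIX shell; the helper names path_contains and add_go_to_profile are hypothetical:

```shell
# path_contains and add_go_to_profile are hypothetical helpers for this sketch.
# path_contains returns 0 if directory $2 is an entry of the
# colon-separated list $1, and 1 otherwise.
path_contains() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Append the export line to the profile file $1 only when
# /usr/local/go/bin is missing from the current PATH.
add_go_to_profile() {
  profile_file="$1"
  if ! path_contains "$PATH" "/usr/local/go/bin"; then
    echo 'export PATH=$PATH:/usr/local/go/bin' >> "$profile_file"
  fi
}

# Usage (not executed here):
#   add_go_to_profile ~/.profile
```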

4.1.4 Build and install cri-dockerd on a Linux system that uses systemd and already has Docker Engine installed

# Clone cri-dockerd and enter the repository
git clone https://github.com/Mirantis/cri-dockerd.git
cd cri-dockerd

# Build as a regular (non-root) user
make cri-dockerd

# Run these commands as root, from inside the cri-dockerd directory
mkdir -p /usr/local/bin
install -o root -g root -m 0755 cri-dockerd /usr/local/bin/cri-dockerd
install packaging/systemd/* /etc/systemd/system
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
systemctl daemon-reload
systemctl enable --now cri-docker.socket



Installing kubeadm, kubelet and kubectl
=============

1. Update the apt package index and install packages needed to use the Kubernetes apt repository:

sudo apt-get update
# apt-transport-https may be a dummy package; if so, you can skip that package
sudo apt-get install -y apt-transport-https ca-certificates curl gpg

2. Download the public signing key for the Kubernetes package repositories. 

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg


3. Add the appropriate Kubernetes apt repository
# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

4. Update the apt package index, install kubelet, kubeadm and kubectl, and pin their version:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Creating a cluster with kubeadm
===============

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///var/run/cri-dockerd.sock


Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

sudo kubeadm join 10.131.187.176:6443 --token knefgg.pdafluamsim49olo --cri-socket=unix:///var/run/cri-dockerd.sock \
        --discovery-token-ca-cert-hash sha256:b058fc69cbec62d085bb38d84f0a89879cbe16068567f061a8fac84f87eab9aa
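
The join command carries a bootstrap token and a CA certificate hash, and a copy-paste error in either one makes the join fail. Bootstrap tokens also expire (24 hours by default); running `kubeadm token create --print-join-command` on the control-plane node prints a fresh join command. A minimal sketch of a format check before joining; the helper names are hypothetical:

```shell
# is_valid_token and is_valid_ca_hash are hypothetical helpers for this sketch.
# A kubeadm bootstrap token has the form <6 chars>.<16 chars>,
# using lowercase letters and digits only.
is_valid_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# The discovery hash is "sha256:" followed by 64 lowercase hex digits.
is_valid_ca_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Usage with the values from the join command above (not executed here):
#   is_valid_token "knefgg.pdafluamsim49olo" && echo "token format ok"
```

This only checks the shape of the values; it cannot tell whether a token is still valid on the control plane.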
-----

kubectl get pods -A
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE
kube-system   coredns-5dd5756b68-86d94               0/1     Pending   0          5m52s
kube-system   coredns-5dd5756b68-tq4v5               0/1     Pending   0          5m52s
kube-system   etcd-k8smaster-vm                      1/1     Running   0          6m5s
kube-system   kube-apiserver-k8smaster-vm            1/1     Running   0          6m8s
kube-system   kube-controller-manager-k8smaster-vm   1/1     Running   0          6m5s
kube-system   kube-proxy-kl7d8                       1/1     Running   0          5m52s
kube-system   kube-scheduler-k8smaster-vm            1/1     Running   0          6m5s
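
In the listing above the two coredns pods are Pending because no pod network add-on has been deployed yet. A minimal sketch that counts pods whose STATUS is not Running, reading `kubectl get pods -A` output from standard input; the helper name count_not_running is hypothetical, and the sketch assumes STATUS is the fourth column:

```shell
# count_not_running is a hypothetical helper for this sketch.
# Reads `kubectl get pods -A` output on stdin, skips the header row,
# and prints how many pods have a STATUS (4th column) other than Running.
count_not_running() {
  awk '$1 != "NAMESPACE" && NF >= 4 && $4 != "Running" { n++ } END { print n + 0 }'
}

# Usage on the control-plane node (not executed here):
#   kubectl get pods -A | count_not_running
```

Note that pods of finished jobs (STATUS Completed) would also be counted, which is fine for a fresh cluster like this one.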

# Install flannel
https://github.com/flannel-io/flannel#deploying-flannel-manually
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Check pods after installing flannel
kubectl get pods -A --watch
NAMESPACE      NAME                                   READY   STATUS    RESTARTS   AGE
kube-flannel   kube-flannel-ds-7m5rq                  1/1     Running   0          54s
kube-system    coredns-5dd5756b68-86d94               1/1     Running   0          7m21s
kube-system    coredns-5dd5756b68-tq4v5               1/1     Running   0          7m21s
kube-system    etcd-k8smaster-vm                      1/1     Running   0          7m34s
kube-system    kube-apiserver-k8smaster-vm            1/1     Running   0          7m37s
kube-system    kube-controller-manager-k8smaster-vm   1/1     Running   0          7m34s
kube-system    kube-proxy-kl7d8                       1/1     Running   0          7m21s
kube-system    kube-scheduler-k8smaster-vm            1/1     Running   0          7m34s

				
			
				
# Copy the file to the Hadoop container
docker cp kubernetes.txt namenode:/tmp/ 

# Get inside the hadoop container
docker exec -it namenode /bin/bash

# 1. Create the root directory for this project (-p avoids an error if it already exists):
hadoop fs -mkdir -p /tmp

# 2. Create the directory for the input files:
hadoop fs -mkdir -p /tmp/Input

# 3. Copy the input file into HDFS:
hadoop fs -put /tmp/kubernetes.txt /tmp/Input
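
Re-running the steps above can fail, because hadoop fs -put refuses to overwrite a file that already exists in HDFS. A minimal sketch of a re-runnable upload, assuming it is run inside the namenode container where the hadoop command is available; the helper name upload_if_missing is hypothetical:

```shell
# upload_if_missing is a hypothetical helper for this sketch.
# $1: local file path, $2: target HDFS directory.
# Uploads the file only if it is not already present in HDFS.
upload_if_missing() {
  local_file="$1"
  hdfs_dir="$2"
  target="$hdfs_dir/$(basename "$local_file")"
  # `hadoop fs -test -e` exits with 0 when the path exists.
  if hadoop fs -test -e "$target"; then
    echo "already in HDFS: $target"
  else
    hadoop fs -mkdir -p "$hdfs_dir" &&
      hadoop fs -put "$local_file" "$hdfs_dir" &&
      echo "uploaded: $target"
  fi
}

# Usage inside the namenode container (not executed here):
#   upload_if_missing /tmp/kubernetes.txt /tmp/Input
```

An alternative is hadoop fs -put -f, which overwrites the existing file instead of skipping the upload.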

# The HDFS web UI is available at:
http://localhost:9870
				
			
				
# spark-shell
scala> val text = sc.textFile("hdfs://localhost:9000/tmp/Input/kubernetes.txt")
text: org.apache.spark.rdd.RDD[String] = hdfs://localhost:9000/tmp/Input/kubernetes.txt MapPartitionsRDD[3] at textFile at <console>:23

scala> text.collect;
res1: Array[String] = Array(https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s, https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/, install-kubeadm/, "", Installing a container runtime, Install Docker Engine on Ubuntu, =============, 1.Set up Docker's apt repository., "", # Add Docker's official GPG key:, sudo apt-get update, sudo apt-get install ca-certificates curl gnupg, sudo install -m 0755 -d /etc/apt/keyrings, curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg, sudo chmod a+r /etc/apt/keyrings/docker.gpg, "", # Add the repository to Apt sources:, echo \, "  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu ...

scala> val counts = text.flatMap(line => line.split(" "))
counts: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[4] at flatMap at <console>:23

scala> counts.collect;
res2: Array[String] = Array(https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s, https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/, install-kubeadm/, "", Installing, a, container, runtime, Install, Docker, Engine, on, Ubuntu, =============, 1.Set, up, Docker's, apt, repository., "", #, Add, Docker's, official, GPG, key:, sudo, apt-get, update, sudo, apt-get, install, ca-certificates, curl, gnupg, sudo, install, -m, 0755, -d, /etc/apt/keyrings, curl, -fsSL, https://download.docker.com/linux/ubuntu/gpg, |, sudo, gpg, --dearmor, -o, /etc/apt/keyrings/docker.gpg, sudo, chmod, a+r, /etc/apt/keyrings/docker.gpg, "", #, Add, the, repository, to, Apt, sources:, echo, \, "", "", "deb, [arch="$(dpkg, --print-architecture)", signed-by=/etc/apt/keyrings...

scala> val mapf = counts.map(word => (word,1))
mapf: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[5] at map at <console>:23

scala> mapf.collect
res3: Array[(String, Int)] = Array((https://www.youtube.com/watch?v=o6bxo0Oeg6o&t=130s,1), (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/,1), (install-kubeadm/,1), ("",1), (Installing,1), (a,1), (container,1), (runtime,1), (Install,1), (Docker,1), (Engine,1), (on,1), (Ubuntu,1), (=============,1), (1.Set,1), (up,1), (Docker's,1), (apt,1), (repository.,1), ("",1), (#,1), (Add,1), (Docker's,1), (official,1), (GPG,1), (key:,1), (sudo,1), (apt-get,1), (update,1), (sudo,1), (apt-get,1), (install,1), (ca-certificates,1), (curl,1), (gnupg,1), (sudo,1), (install,1), (-m,1), (0755,1), (-d,1), (/etc/apt/keyrings,1), (curl,1), (-fsSL,1), (https://download.docker.com/linux/ubuntu/gpg,1), (|,1), (sudo,1), (gpg,1), (--dearmor,1), (-o,1), (/etc/apt/ke...

scala> val reducef = mapf.reduceByKey(_+_);
reducef: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[6] at reduceByKey at <console>:23

scala> reducef.collect
res4: Array[(String, Int)] = Array((package,4), (index,1), (cluster.,1), (kube-scheduler-k8smaster-vm,2), ("$(.,1), (-e,1), (/',1), (/etc/kubernetes/admin.conf,1), (/etc/os-release,1), (This,1), (repository.,1), ([signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg],1), (RESTARTS,2), (kube-flannel,1), (kube-apiserver-k8smaster-vm,2), (daemon-reload,1), (export,2), (gpg,3), (already,1), (any,2), (go,1), (make,1), (network,1), (Download,2), (git,1), (control-plane,1), (4.,2), (packaging/systemd/*,1), (-o,3), (are,1), ("kubectl,1), (2.,2), (sha256:b058fc69cbec62d085bb38d84f0a89879cbe16068567f061a8fac84f87eab9aa,1), ([podnetwork].yaml",1), (https://download.docker.com/linux/ubuntu/gpg,1), (STATUS,2), (kubelet,3), (overwrites,1), (commands,1), (can,3), (tee,2), (...
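
The counts in res4 can be cross-checked outside Spark. A minimal sketch using standard shell tools rather than Spark itself, assuming a local copy of the input file; the helper name word_count is hypothetical. Unlike split(" ") in the session above, this sketch drops the empty-string tokens produced by consecutive spaces and blank lines:

```shell
# word_count is a hypothetical helper for this sketch, not a Spark or Hadoop tool.
# $1: path to a local text file.
# Splits on single spaces, drops empty tokens, and prints "count<TAB>word"
# lines sorted by descending count.
word_count() {
  tr ' ' '\n' < "$1" |
    awk 'length($0) > 0 { c[$0]++ }
         END { for (w in c) printf "%d\t%s\n", c[w], w }' |
    sort -k1,1nr -k2
}

# Usage inside the namenode container (not executed here):
#   word_count /tmp/kubernetes.txt | head
```

Individual pairs such as (gpg,3) in res4 can then be compared against the matching line of this output.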


				
			
