Ceph

14 Notes
+ RBD (Oct. 30, 2017, 8:31 a.m.)

rbd is a utility for manipulating RADOS block device (RBD) images, used by the Linux rbd driver and the rbd storage driver for QEMU/KVM. RBD images are simple block devices that are striped over objects and stored in a RADOS object store. The size of the objects the image is striped over must be a power of two.
-------------------------------------------------------------
rbd -p image ls
rbd -p image info Windows7x8
rbd -p image rm Win7x86WithApps
rbd export --pool=image disk_user01_2 /root/Windows7x86.qcow2
The "2" is the ID of the Template in the deskbit admin panel.
-------------------------------------------------------------
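A minimal sketch of creating an image with an explicit power-of-two object size (the image name disk_test01 is hypothetical; older rbd clients take the object size as --order, i.e. log2 of the size in bytes, newer ones also accept --object-size):
rbd create --pool image --size 10240 --order 22 disk_test01   # 10 GiB image, 4 MiB (2^22) objects
rbd -p image info disk_test01                                 # shows size, object count and order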

+ Changing a Monitor’s IP address (Sept. 19, 2017, 3:12 p.m.)

http://docs.ceph.com/docs/kraken/rados/operations/add-or-rm-mons/
-----------------------------------------------------------------------
ceph mon getmap -o /tmp/a
monmaptool --print /tmp/a
monmaptool --rm vdiali /tmp/a
monmaptool --add vdiali 10.10.1.121 /tmp/a
monmaptool --print /tmp/a
systemctl stop ceph-mon*
ceph-mon -i vdimohsen --inject-monmap /tmp/a

Change IP in the following files:
/etc/network/interfaces
/etc/default/avalaunch
/etc/ceph/ceph.conf
/etc/hosts
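After injecting the monmap and updating the config files, a quick way to confirm the monitor came back on the new address (a sketch, using the ceph-mon.target unit mentioned in the Commands note):
systemctl start ceph-mon.target
ceph mon stat     # monitor addresses should now show the new IP
ceph -s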

+ Properly remove an OSD (Aug. 23, 2017, 11:05 a.m.)

Removing an OSD, if not done properly, can result in double rebalancing. The best practice is to change the CRUSH weight to 0.0 as the first step:
$ ceph osd crush reweight osd.<ID> 0.0
Then wait for rebalancing to complete. Finally, remove the OSD completely:
$ ceph osd out <ID>
$ service ceph stop osd.<ID>
$ ceph osd crush remove osd.<ID>
$ ceph auth del osd.<ID>
$ ceph osd rm <ID>
----------------------------------------------------------
From the docs:
Remove an OSD
To remove an OSD from the CRUSH map of a running cluster, execute the following:
ceph osd crush remove {name}
For getting the name:
ceph osd tree
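While waiting for the rebalance triggered by the reweight, progress can be watched with the usual status commands (the same ones listed in the Commands note):
ceph -w              # live stream of cluster events
ceph -s              # summary; wait until all PGs are active+clean again
ceph health detail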

+ Errors - undersized+degraded+peered (July 4, 2017, 3:55 p.m.)

http://mohankri.weebly.com/my-interest/single-host-multiple-osd
---------------------------------------------------------
ceph osd crush rule create-simple same-host default osd
ceph osd pool set rbd crush_ruleset 1
---------------------------------------------------------
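These two commands make the rbd pool replicate across OSDs instead of across hosts, which is what a single-host cluster needs for PGs to leave the undersized+degraded+peered state. An alternative sketch, assuming the cluster is being (re)deployed rather than fixed live, is to set the CRUSH failure domain to OSD in ceph.conf before creating the first monitor:
[global]
osd crush chooseleaf type = 0    # 0 = osd, 1 = host (the default)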

+ Commands (July 3, 2017, 2:23 p.m.)

ceph osd tree
ceph osd dump
ceph osd lspools
ceph osd pool ls
ceph osd pool get rbd all
ceph osd pool set rbd size 2
ceph osd crush rule ls
-----------------------------------------------------
ceph-osd -i 0
ceph-osd -i 0 --mkfs --mkkey
-----------------------------------------------------
ceph -w
ceph -s
ceph health detail
-----------------------------------------------------
ceph-disk activate /var/lib/ceph/osd/ceph-0
ceph-disk list
chown ceph:disk /dev/sda1 /dev/sdb1
-----------------------------------------------------
ceph-mon -f --cluster ceph --id vdi --setuser ceph --setgroup ceph
-----------------------------------------------------
systemctl -a | grep ceph
systemctl status ceph-osd*
systemctl status ceph-mon*
systemctl enable ceph-mon.target
-----------------------------------------------------
rbd -p image ls
rbd export --pool=image disk_win_7 /root/win7.img
-----------------------------------------------------
cd /var/lib/ceph/osd/
# ceph-2  ceph-3  ceph-8
mount
mount | grep -i vda
mount | grep -i vdb
mount | grep -i vdc
mount | grep ceph
fdisk -l
mount /dev/vdc1 ceph-3/
systemctl restart ceph-osd@3
ceph osd tree
********************************
systemctl restart ceph-osd@5
mount | grep -i ceph
systemctl restart ceph-osd@5
  Job for ceph-osd@5.service failed because the control process exited with error code.
  See "systemctl status ceph-osd@5.service" and "journalctl -xe" for details.
systemctl daemon-reload
systemctl restart ceph-osd@5
ceph osd tree
ceph -w
-----------------------------------------------------

+ ceph-ansible (Jan. 7, 2017, 9:28 a.m.)

https://github.com/ceph/ceph-ansible
---------------------------------------------------
0- apt-get update   # Ensure you do this step before running ceph-ansible!
1- apt-get install libffi-dev libssl-dev python-pip python-setuptools sudo python-dev
   git clone https://github.com/ceph/ceph-ansible/
---------------------------------------------------
2- pip install markupsafe ansible
---------------------------------------------------
3- Set up your Ansible inventory file:
[mons]
mohsen3.deskbit.local

[osds]
mohsen3.deskbit.local
---------------------------------------------------
4- Now enable the site.yml and group_vars files:
cp site.yml.sample site.yml
You need to copy all files within the `group_vars` directory, omitting the `.sample` part:
for f in *.sample; do cp "$f" "${f/.sample/}"; done
---------------------------------------------------
5- Open the file `group_vars/all.yml` for editing:
nano group_vars/all.yml
Uncomment the variable `ceph_origin` and replace `upstream` with `distro`:
ceph_origin: 'distro'
Uncomment and replace:
monitor_interface: eth0
Uncomment:
journal_size: 5120
---------------------------------------------------
6- Choosing a scenario:
Open the file `group_vars/osds.yml`, then uncomment and set to `true` the following variables:
osd_auto_discovery: true
journal_collocation: true
---------------------------------------------------
7- Any configs needed for Ceph should be added to the file `group_vars/all.yml`. Uncomment and change:
ceph_conf_overrides:
  global:
    osd_pool_default_pg_num: 8
    osd_pool_default_size: 1
---------------------------------------------------
Path to the variables (ceph.conf) template file:
/etc/ansible/playbooks/ceph/ceph-ansible/roles/ceph-common/templates/ceph.conf.j2
---------------------------------------------------
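Once the inventory and group_vars are in place, a minimal sketch of actually running the playbook (assuming the inventory from step 3 is saved as a file named `hosts` inside the ceph-ansible checkout; adjust the path if you use the default /etc/ansible/hosts):
cd ceph-ansible
ansible-playbook -i hosts site.yml    # deploys the mons and osds defined in the inventory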

+ Adding Monitors (Jan. 4, 2017, 12:43 p.m.)

A Ceph Storage Cluster requires at least one Ceph Monitor to run. For high availability, Ceph Storage Clusters typically run multiple Ceph Monitors so that the failure of a single Ceph Monitor will not bring down the Ceph Storage Cluster. Ceph uses the Paxos algorithm, which requires a majority of monitors (i.e., 1, 2:3, 3:4, 3:5, 4:6, etc.) to form a quorum.

Add two Ceph Monitors to your cluster:
-------------------------------------------
ceph-deploy mon add node2
ceph-deploy mon add node3
-------------------------------------------
Once you have added your new Ceph Monitors, Ceph will begin synchronizing the monitors and form a quorum. You can check the quorum status by executing the following:
ceph quorum_status --format json-pretty
-------------------------------------------
When you run Ceph with multiple monitors, you SHOULD install and configure NTP on each monitor host. Ensure that the monitors are NTP peers.
-------------------------------------------
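A sketch of the NTP setup on a Debian/Ubuntu monitor host (package and service names assumed to be the stock `ntp` package):
sudo apt-get install ntp
sudo systemctl enable ntp
sudo systemctl start ntp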

+ Adding an OSD (Jan. 4, 2017, 12:38 p.m.)

1- mkdir /var/lib/ceph/osd/ceph-3
2- ceph-disk prepare /var/lib/ceph/osd/ceph-3
3- ceph-disk activate /var/lib/ceph/osd/ceph-3
4- Once you have added your new OSD, Ceph will begin rebalancing the cluster by migrating placement groups to your new OSD. You can observe this process with the ceph CLI:
ceph -w
You should see the placement group states change from active+clean to active with some degraded objects, and finally back to active+clean when the migration completes. (Press Ctrl-C to exit.)
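A quick check that the new OSD actually joined the cluster, using commands already listed in the Commands note:
ceph osd tree     # the new osd.3 should appear as up and in
ceph osd stat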

+ Storage Cluster (Jan. 3, 2017, 1:40 p.m.)

To purge the Ceph packages, execute: (used when you want to purge data)
ceph-deploy purge node1
If at any point you run into trouble and you want to start over, execute the following to purge the configuration:
ceph-deploy purgedata node1
ceph-deploy forgetkeys
--------------------------------------------
1- Create a directory on your admin node for maintaining the configuration files and keys that ceph-deploy generates for your cluster:
mkdir my-cluster
cd my-cluster
--------------------------------------------
2- Create the cluster:
ceph-deploy new node1
Using the `ls` command, you should see a Ceph configuration file, a monitor secret keyring, and a log file for the new cluster.
--------------------------------------------
3- Change the default number of replicas in the Ceph configuration file from 3 to 2 so that Ceph can achieve an active + clean state with just two Ceph OSDs. Add the following lines under the [global] section (see the snippet after this note):
osd pool default size = 2
osd_max_object_name_len = 256
osd_max_object_namespace_len = 64
The last two options are for EXT4, based on this link:
http://docs.ceph.com/docs/jewel/rados/configuration/filesystem-recommendations/
--------------------------------------------
4- Install Ceph:
ceph-deploy install node1
The ceph-deploy utility will install Ceph on each node.
--------------------------------------------
5- Add the initial monitor(s) and gather the keys:
ceph-deploy mon create-initial
Once you complete the process, your local directory should have the following keyrings:
{cluster-name}.client.admin.keyring
{cluster-name}.bootstrap-osd.keyring
{cluster-name}.bootstrap-mds.keyring
{cluster-name}.bootstrap-rgw.keyring
--------------------------------------------
6- Add OSDs:
For fast setup, this quick start uses a directory rather than an entire disk per Ceph OSD Daemon. See http://docs.ceph.com/docs/master/rados/deployment/ceph-deploy-osd for details on using separate disks/partitions for OSDs and journals.
Log in to the Ceph Nodes and create a directory for each Ceph OSD Daemon:
ssh node2
sudo mkdir /var/local/osd0
exit
ssh node3
sudo mkdir /var/local/osd1
exit
Then, from your admin node, use ceph-deploy to prepare the OSDs:
ceph-deploy osd prepare node2:/var/local/osd0 node3:/var/local/osd1
Finally, activate the OSDs:
ceph-deploy osd activate node2:/var/local/osd0 node3:/var/local/osd1
--------------------------------------------
7- Use ceph-deploy to copy the configuration file and admin key to your admin node and your Ceph Nodes, so that you can use the ceph CLI without having to specify the monitor address and ceph.client.admin.keyring each time you execute a command:
ceph-deploy admin node1 node2
Log in to the nodes and ensure that you have the correct permissions for ceph.client.admin.keyring:
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph health
-------------------------------------------
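For step 3, the file to edit is the ceph.conf that `ceph-deploy new` generated in the my-cluster directory; as a sketch, its [global] section would then contain:
[global]
osd pool default size = 2
osd_max_object_name_len = 256
osd_max_object_namespace_len = 64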

+ Ceph Node Setup (Jan. 3, 2017, 1:25 p.m.)

1- Create a user on each Ceph Node.
--------------------------------------------
2- Add sudo privileges for the user on each Ceph Node.
--------------------------------------------
3- Configure your ceph-deploy admin node with password-less SSH access to each Ceph Node (ssh-keygen and ssh-copy-id; see the sketch after this list).
--------------------------------------------
4- Modify the ~/.ssh/config file of your ceph-deploy admin node so that it logs into Ceph Nodes as the user you created:
Host node1
    Hostname node1
    User root
Host node2
    Hostname node2
    User root
Host node3
    Hostname node3
    User root
--------------------------------------------
5- Add to /etc/hosts:
10.10.0.84 node1
10.10.0.85 node2
10.10.0.86 node3
10.10.0.87 node4
--------------------------------------------
6- Change the hostname of each node to the ones from the earlier step (node1, node2, node3, ...):
nano /etc/hostname
Reboot each node.
--------------------------------------------
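A minimal sketch of step 3, run on the admin node ({username} is the user created in step 1; the node names are the ones from step 5):
ssh-keygen                       # accept the defaults; an empty passphrase keeps ceph-deploy non-interactive
ssh-copy-id {username}@node1
ssh-copy-id {username}@node2
ssh-copy-id {username}@node3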

+ Acronyms (Jan. 1, 2017, 2:10 p.m.)

CRUSH: Controlled Replication Under Scalable Hashing
EBOFS: Extent and B-tree based Object File System
HPC: High-Performance Computing
MDS: MetaData Server
OSD: Object Storage Device
PG: Placement Group
PGP: Placement Group for Placement purposes
POSIX: Portable Operating System Interface for Unix
RADOS: Reliable Autonomic Distributed Object Store
RBD: RADOS Block Device

+ Ceph Deploy (Dec. 28, 2016, 11:21 a.m.)

Descriptions:
The admin node must have password-less SSH access to the Ceph nodes. When ceph-deploy logs into a Ceph node as a user, that particular user must have passwordless sudo privileges.
We recommend installing NTP on Ceph nodes (especially on Ceph Monitor nodes) to prevent issues arising from clock drift. See Clock for details. Ensure that you enable the NTP service and that each Ceph Node uses the same NTP time server.
------------------------------------------------------
For ALL Ceph Nodes, perform the following steps:
sudo apt-get install openssh-server
------------------------------------------------------
Create a Ceph Deploy User:
The ceph-deploy utility must log into a Ceph node as a user that has passwordless sudo privileges, because it needs to install software and configuration files without prompting for passwords.
We recommend creating a specific user for ceph-deploy on ALL Ceph nodes in the cluster. Please do NOT use "ceph" as the username. A uniform user name across the cluster may improve ease of use (not required), but you should avoid obvious user names, because hackers typically use them in brute-force attacks (e.g., root, admin, {productname}).
The following procedure, substituting {username} for the username you define, describes how to create a user with passwordless sudo:
sudo useradd -d /home/{username} -m {username}
sudo passwd {username}
------------------------------------------------------
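The procedure above stops at setting a password; to actually grant the user passwordless sudo (the step the official quick-start preflight continues with), something along these lines is needed on each node:
echo "{username} ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/{username}
sudo chmod 0440 /etc/sudoers.d/{username}
------------------------------------------------------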

+ Installation (Dec. 27, 2016, 2:27 p.m.)

http://docs.ceph.com/docs/master/start/quick-start-preflight/
----------------------------------------------------
1- wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
2- echo deb https://download.ceph.com/debian-hammer/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
3- sudo apt-get install ceph ceph-deploy
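A small addendum, in case it saves a retry: the new repository has to be indexed before step 3, and the install can be verified afterwards:
sudo apt-get update      # refresh the package index after adding the Ceph repo
ceph --version
ceph-deploy --version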

+ Definitions (Dec. 27, 2016, 11:40 a.m.)

Ceph: Ceph is a storage technology.
-------------------------------------------------
Cluster: A cluster is a group of servers and other resources that act like a single system and enable high availability and, in some cases, load balancing and parallel processing.
-------------------------------------------------
Clustering vs. Clouding: A cluster differs from a cloud or grid in that a cluster is a group of computers connected by a local area network (LAN), whereas a cloud is wider in scale and can be geographically distributed. Another way to put it is that a cluster is tightly coupled, whereas a cloud is loosely coupled. Also, clusters are made up of machines with similar hardware, whereas clouds are made up of machines with possibly very different hardware configurations.
-------------------------------------------------
Ceph Storage Cluster: A distributed object store that provides storage of unstructured data for applications.
-------------------------------------------------
Ceph Object Gateway: A powerful S3- and Swift-compatible gateway that brings the power of the Ceph Object Store to modern applications.
-------------------------------------------------
Ceph Block Device: A distributed virtual block device that delivers high-performance, cost-effective storage for virtual machines and legacy applications.
-------------------------------------------------
Ceph File System: A distributed, scale-out filesystem with POSIX semantics that provides storage for legacy and modern applications.
-------------------------------------------------
RADOS: A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes.
-------------------------------------------------
LIBRADOS: A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP.
-------------------------------------------------
RADOSGW: A bucket-based REST gateway, compatible with S3 and Swift.
-------------------------------------------------
RBD: A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver.
-------------------------------------------------
Ceph FS: A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE.
-------------------------------------------------
pg_num: the number of placement groups in a pool (each PG is in turn mapped onto one or more OSDs).
-------------------------------------------------
Placement Groups (PGs): Ceph maps objects to placement groups. Placement groups are shards or fragments of a logical object pool that place objects as a group into OSDs. Placement groups reduce the amount of per-object metadata when Ceph stores the data in OSDs. A larger number of placement groups (e.g., 100 per OSD) leads to better balancing.
-------------------------------------------------
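As a small illustration of pg_num in practice (the pool name testpool is hypothetical):
ceph osd pool create testpool 128 128    # create a pool with pg_num = pgp_num = 128
ceph osd pool get testpool pg_num        # read the value back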