Deploy and Manage Red Hat's OpenStack Platform With StackIQ

Posted by Greg Bruno on Apr 15, 2014 10:03:39 AM

StackIQ has officially partnered with Red Hat to simplify the process of deploying Red Hat's OpenStack Platform (RHEL-OSP). StackIQ Cluster Manager is ideal for deploying the hardware infrastructure and application stacks of heterogeneous data center environments. With Red Hat's OpenStack offering, StackIQ Cluster Manager handles the automatic bare-metal installation of disparate hardware and correct configuration of the multiple networks required by OpenStack. StackIQ Cluster Manager also automatically deploys Red Hat's Foreman, which enables web-based configuration of OpenStack software on the deployed nodes. All OpenStack services are available for management and deployment via the OpenStack Dashboard, including Nova, Neutron, Cinder, Swift, etc. StackIQ Cluster Manager enables the ongoing management, deployment, and integration of Red Hat OpenStack services on growing and multi-use clusters.

StackIQ takes a “software defined infrastructure” approach to provision and manage cluster infrastructure that sits below applications like OpenStack and Hadoop. In this post, we’ll discuss how this is done, followed by a step-by-step guide to installing RHEL Foreman/OpenStack with StackIQ’s management system.

Components:
The hardware used for this deployment was a small cluster: 1 node (i.e., 1 server) is used for the StackIQ Cluster Manager, 1 node is used as the Foreman server, and 3 nodes are used as backend/data nodes. Each node has 1 disk and all nodes are connected together via 1Gb Ethernet on a private network. StackIQ Cluster Manager, Foreman server, and OpenStack controller nodes are also connected to a corporate public network using the second NIC. Additional networks dedicated to OpenStack services can also be used but are not depicted in this graphic or used in this example. StackIQ Cluster Manager has been used in similar deployments between 2 nodes and 4,000+ nodes.
[Figure: Red Hat OpenStack deployment architecture]
 

Step 1. Install StackIQ Cluster Manager

The StackIQ Cluster Manager node is installed from bare-metal (i.e., there is no pre-requisite software and no operating system previously installed) by burning the StackIQ Cluster Core Roll ISO to DVD and booting from it (the StackIQ Cluster Core Roll can be obtained from the “Rolls” section after registering at http://www.stackiq.com/download/). The Cluster Core Roll leads the user through a few simple forms (e.g., what is the IP address of StackIQ Cluster Manager, what is the gateway, DNS server, etc.) and then asks for a base OS DVD (for example, Red Hat Enterprise Linux 6.5; other Red Hat-like distributions such as CentOS are supported as well, but for RHEL OpenStack, only RHEL certified media are acceptable). The installer copies all the bits from both DVDs and automatically creates a new Red Hat distribution by blending the packages from both DVDs together.

The remainder of StackIQ Cluster Manager installation requires no further manual steps and this entire step takes between 30 to 40 minutes.

A detailed description of StackIQ Cluster Manager can be found in section 3 of the StackIQ Users Guide. It is strongly recommended that you familiarize yourself with at least this section before proceeding. (C’mon, really, it’s not that bad. The print is large and there are a bunch of pictures, shouldn’t take long.)

https://s3.amazonaws.com/stackiq-release/stack3/roll-cluster-core-usersguide.pdf 


 

Step 2. Install the RHEL OpenStack Bridge and RHEL OpenStack RPMS Rolls 

StackIQ has developed software that “bridges” our core infrastructure management solution to Red Hat’s OpenStack Platform; we’ve named it the RHEL OpenStack Bridge Roll. The RHEL OpenStack Bridge Roll is used to spin up Foreman services by installing a Foreman appliance. This allows you to leverage RHEL’s Foreman/OpenStack puppet integration to deploy a fully operational OpenStack cloud.

StackIQ Cluster Manager uses the concept of “rolls” to combine packages (RPMs) and configuration (XML files which are used to build custom kickstart files) to dynamically add and automatically configure software services and applications.

The first step is to install StackIQ Cluster Manager as a deployment machine. This requires, at a minimum, the cluster-core and RHEL 6.5 ISOs. It is not possible to add StackIQ Cluster Manager to an already existing RHEL 6.5 machine; you must start with the installation of StackIQ Cluster Manager. The rhel-openstack-bridge, rhel-6-server-openstack-4.0-rpms, and RHEL-Updates rolls are not necessary at installation time; they can be added to StackIQ Cluster Manager after the fact. This saves on CD/DVD burning and time when adding multiple rolls during StackIQ Cluster Manager installation.

What You’ll Need:

  • RHEL Server 6.5 ISO (This is something you supply via Red Hat download from your Red Hat subscription.)
  • The rhel-openstack-bridge roll ISO.
  • The rhel-6-server-openstack-4.0-rpms roll ISO. On StackIQ Cluster Manager, download it from: http://stackiq-release.s3.amazonaws.com/stack3/rhel-6-server-openstack-4.0-rpms-6.5-0.x86_64.disk1.iso
  • The RHEL-Updates-04112014 roll ISO. On StackIQ Cluster Manager, download it from: http://stackiq-release.s3.amazonaws.com/stack3/RHEL-Updates-04112014-0.x86_64.disk1.iso (The Heartbleed vulnerability is fixed in updates from Red Hat and is contained in the RHEL-Updates roll.)

After StackIQ Cluster Manager is installed and booted, it is time to add the RHEL OpenStack Bridge, RHEL OpenStack RPMS, and RHEL-Updates rolls.

It is highly recommended that you check the MD5 checksums of the downloaded media.

You must burn the cluster-core roll and RHEL Server 6.5 ISOs to disk, or, if installing via virtual CD/DVD, simply mount the ISOs on the machine's virtual media via the BMC.

Then follow the instructions in section 3 of https://s3.amazonaws.com/stackiq-release/stack3/roll-cluster-core-usersguide.pdf to install StackIQ Cluster Manager. (Yes! I mentioned it again.)

Additionally, the video linked in the Resources section at the end of this post takes you through the basic process of installing StackIQ Cluster Manager and backend nodes. Specific instructions for Foreman/OpenStack follow.

Verify the MD5 checksums:

# md5sum rhel-openstack-bridge-1.0-0.x86_64.disk1.iso
should return f7a2e2cef16d63021e5d2b7bc2b16189
 
# md5sum rhel-6-server-openstack-4.0-rpms-6.5-0.x86_64.disk1.iso
should return 19c05af49e53f90a2cc9bcd4dddb353f

# md5sum RHEL-Updates-04112014-0.x86_64.disk1.iso
should return 44b2aeb7ec26c9f1f15e615604101304

  

Then execute the following commands on the frontend:

# rocks add roll rhel-openstack-bridge-1.0-0.x86_64.disk1.iso
# rocks add roll rhel-6-server-openstack-4.0-rpms-6.5-0.x86_64.disk1.iso
# rocks add roll RHEL-Updates-04112014-0.x86_64.disk1.iso
# rocks enable roll rhel-openstack-bridge rhel-6-server-openstack-4.0-rpms RHEL-Updates-6.5
# rocks create distro
# rocks run roll rhel-openstack-bridge | sh

(The rhel-6-server-openstack-4.0-rpms and RHEL-Updates rolls do not contain any configuration scripts, so they do not need to be configured with a “rocks run roll” command.)

StackIQ Cluster Manager is now configured and ready to install a Foreman appliance and the OpenStack backend nodes.
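If you want to verify that, listing the rolls is a quick, optional check; all three rolls should show up as present and enabled:

# rocks list roll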

 

Step 3. Install the Foreman Appliance

StackIQ Cluster Manager contains the notion of an “appliance.” An appliance has a kickstart structure that installs a preconfigured set of RPMs and services, allowing for a focused installation of a particular application. The bridge roll provides a “Foreman” appliance that sets up the automatic installation of the RHEL-OSP Foreman server with the required OpenStack infrastructure. It’s the fastest way to get a Foreman server up and running.

Installing Backend Nodes Using Discovery Mode in the StackIQ Cluster Manager GUI

“Discovery” mode allows the automatic installation of backend nodes without pre-populating the StackIQ Cluster Manager database with node names, IP addresses, MAC addresses, etc. The StackIQ Cluster Manager runs DHCP to answer and install any node making a PXE request on the subnet. This is ideal on networks where you (a) have full control of the network and the policies on the network, and (b) don’t care about the naming convention of your nodes. If one of these is not true, please follow the instructions for populating the database in “Install Your Compute Nodes Using CSV Files” in the cluster-core roll documentation referenced above (Section 3.4.2).

“Discovery” mode is no longer turned on by default, as it may conflict with a company’s networking policy. To turn on Discovery mode, in a terminal or ssh session on StackIQ Cluster Manager do the following:

# rocks set attr discover_start true

To turn it off after installation if you wish:

# rocks set attr discover_start false

DHCP is always running but with “discover_start” set to “false,” it will not promiscuously answer PXE requests.
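To confirm the current setting at any time, you can grep it out of the global attribute list (a quick, optional check):

# rocks list attr | grep discover_start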

With Discovery turned on, you can perform installation of backend nodes via the GUI or via the command line. To install via the GUI, go to the StackIQ Cluster Manager GUI at http://<StackIQ Cluster Manager hostname or IP>.

 

Click the Login link and log in as “root” with the password set for “root” during installation.
Go to the “Discover” tab:

Click on Appliance, and choose “Foreman” and click “Start.”

 

Boot the server you are using as the Foreman server. All backend nodes should be set to PXE first on the network interface attached to the private network. This is a hard requirement.

In the GUI, you should see a server called “foreman-0-0” appear in the dialog, and in sufficient time, the Visualization area in the "Discover" tab will indicate the network traffic being used during installation.

The Foreman server appliance installation is somewhat chatty. You’ll receive status updates in the Messages box at the bottom of the page for what is happening on the node. The bare metal installation of the Foreman server is relatively short, about 20 minutes depending on the size of the disks being formatted. The installation of the Foreman application takes longer and happens after the initial boot due to RPM packaging constraints of the Foreman installer. It should be done, beginning to end, in about an hour.

When the machine is up, the indicator next to its name will be green and there will be a message in the alerts box indicating the machine has installed Foreman.

Using the command line:

If for some reason you do not have access to the front-end web GUI or access is extremely slow, or if you just happen to be a command line person, there is a command to do discovery of backend resources.

To install a Foreman appliance:

# insert-ethers

Choose “Foreman” and choose “OK”

 

Boot the machine and it should be discovered, assuming PXE first.

Once the Foreman server is installed, you can access its web interface by running Firefox on StackIQ Cluster Manager. It should be available at the IP address listed in the output of:

# rocks list host interface foreman-0-0

Adding an additional interface

If you want it accessible on the public or corporate network and not just on the private network, it will be necessary to add another network interface attached to the public network.

If the interface was detected during install:

# rocks set host interface ip <hostname> <iface> <ip address>

# rocks set host interface subnet <hostname> <iface> public

If you add the interface after the fact:

# rocks add host interface help

And fill in the appropriate fields.
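As an illustration only, adding a public interface for the Foreman server might look something like the line below. The argument names and order here are assumptions, so trust the “help” output above over this sketch, and substitute your own device name and public IP:

# rocks add host interface foreman-0-0 iface=eth1 ip=<public IP> subnet=public name=foreman-0-0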

In either event, to make the network change live, sync the network:

# rocks sync host network foreman-0-0 restart=yes

This procedure is more clearly delineated in section 4.3 of the cluster-core roll documentation, referenced (twice!) above.

Step 4. Install the Backend Nodes

Before we install the backend nodes (also known as “compute nodes”), we want to ensure that all disks in the backend nodes are configured and controlled by the StackIQ Cluster Manager. On node reinstall, this prevents the inadvertent loss of data on disks that are not the system disk. Now, we don’t want to reconfigure the controller and reformat disks on every installation, so we need to instruct the StackIQ Cluster Manager to perform this task the next time the backend nodes install. We do this by setting an attribute (“nukedisks”):

# rocks set appliance attr compute nukedisks true

After node reinstallation, this attribute is automatically set to “false,” so the only way to reformat non-system disks is to deliberately set this attribute to “true” before a node reinstall.
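To double-check the attribute before kicking off the installs, you can list the appliance-level attributes (this assumes the standard “rocks list appliance attr” command):

# rocks list appliance attr compute | grep nukedisks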

Now we are ready to install the backend nodes. This is the same procedure that we used to install the Foreman server. This time, however, choose “Compute” as the appliance, whether you are using the web GUI or the CLI command “insert-ethers”.

Make sure the StackIQ Cluster Manager is in "discovery" mode using the CLI or GUI and all backend nodes are PXE booted. StackIQ Cluster Manager discovers and installs each backend node in parallel, packages are installed in parallel, and disks on the node are also formatted in parallel. All this parallelism allows us to install an entire cluster, no matter the size, in about 10 to 20 minutes -- no manual steps are required. For more information on installing and using the StackIQ Cluster Manager, please visit http://www.stackiq.com/support/ or http://www.youtube.com/stackiq. Please review the video referenced above and section 3.4 of the cluster-core roll documentation for questions.

After all the nodes in the cluster are up and running, you will be ready to deploy OpenStack via the Foreman web interface. In this example, the StackIQ Cluster Manager node was named “kaiza” and the foreman server was named “foreman-0-0.” The compute nodes were assigned default names of compute-0-0, compute-0-1, compute-0-2.

This is how it looks on the GUI when all the installs are completed.

Step 5. Configuring Foreman to Deploy OpenStack

The Foreman server as supplied by RHEL contains all the puppet manifests required to deploy machines with OpenStack roles. With the backend nodes installed and properly reporting to Foreman, we can go to the Foreman web GUI and configure the backend nodes to run OpenStack.

The example here will be for the simplest case: a Nova Network Controller using a single network, and a couple of hypervisors running the Cirros cloud image.

More complex cases (Neutron, Swift, Cinder) will follow in the next few weeks as appendices to this document. Feel free to experiment ahead of those instructions, however.

1. Go to https://<Foreman server hostname or IP>

If the security certificate is not trusted, choose “Proceed Anyway” or, in Firefox, accept the certificate.

You should get a login screen:


2. Log in. The default username is “admin” and the default password is “changeme.” Take the time to change the password once you log in, especially if the Foreman server is available to the outside world.

3. Add a controller node

You should see all the nodes you’ve installed listed on the Foreman Dashboard. Click on the “Hosts” tab to go to the hosts window.

Click on the machine you intend to use as a controller node. You will have to change some parameters to reflect the network you are using for OpenStack (in this example, the private one).

 

It’s highly recommended that this machine also have a connection to the external network (www or corporate internet) to simplify web access. See “Adding an additional interface” above on how to do that. Do not choose the Foreman server as a controller node: the OpenStack Dashboard overwrites httpd configuration files and will disable the ability to log into the Foreman web server. However, if you have a small cluster, you can add the Foreman server as an OpenStack Compute node, as we do in this example. You may not want to do that in a larger cluster, though. Separation of services is almost always a good thing.

Click on the host; we will use “compute-0-0.” When the “compute-0-0” page comes up, click on “Edit.” You should see a page called “Edit compute-0-0.local.” Set the “Host Group” tab to “Controller (Nova Network).” (An example of Neutron networking will follow in later appendices to this document.)

Click on the “Parameters” tab. There are a lot of parameters here, but we will change the minimum to reflect our network.

Click the “Override” button next to the following parameters:

controller_priv_host

controller_pub_host

mysql_host

qpid_host

These parameters will be listed at the bottom of the page with text fields to change them. The controller_priv_host, mysql_host, and qpid_host should all be changed to the private interface IP of the controller node, i.e., the machine you are editing right now.

The controller_pub_host should be the IP address of the public interface (if you have added one) of the controller node, i.e. the machine you are editing right now.

If you don’t know the IP address of the controller node, in a terminal on the StackIQ Cluster Manager, do the following
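A minimal sketch of that check, reusing the “rocks list host interface” command shown earlier (compute-0-0 is the controller in our example):

# rocks list host interface compute-0-0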
In this instance, the public interface is on eth2 (we set it up that way and cabled it to the corporate external network) and has IP address 192.168.1.60, so that is the value used for controller_pub_host.

Once you’ve made the changes, click “Submit.”
Going back to the “Hosts” tab, you should see that “compute-0-0.local” now has the “Controller (Nova Network)” role.
There is a puppet agent that runs on each machine. It runs every 30 minutes. This will automatically update the machine’s configuration and make it the OpenStack Controller. If you don’t want to wait that long, start the puppet process yourself from StackIQ Cluster Manager. (Alternatively, you can ssh to compute-0-0 and manually run “puppet agent -tv”.)
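To kick off that run from StackIQ Cluster Manager, a sketch using the same “rocks run host” pattern that appears later in this post (compute-0-0 is our example controller):

# rocks run host compute-0-0 "puppet agent -tv"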

Once the puppet run finishes, you can add OpenStack Computes. (The puppet run on the controller node can take awhile to execute.)

Add OpenStack Compute Nodes

There isn’t much for an OpenStack Controller to do if it can’t launch instances of images, so we need a couple of hypervisors. We’ll do this a little differently than the Controller node, where we edited one individual machine, and instead, edit the “Host Group” we want the computes to run as. This allows us to make the changes once and apply them to all the machines.

Go to “More” and choose “Configuration” from the drop down, then click on “Host Groups” in the next drop down.

Click on “Compute (Nova Network)” and it will bring you to an “Edit Compute (Nova Network)” screen:

Choose the “Parameters” tab:
 

We’re going to edit a number of fields, similar to the Controller node. Click the “override” button on each of the following parameters and edit them at the bottom of the page:

controller_priv_host - set to private IP address of controller

controller_pub_host - set to public IP address of controller

mysql_host - set to private IP address of controller

qpid_host - set to private IP address of controller

nova_network_private_iface - set to the device name of the private network interface

nova_network_public_iface - set to the device name of the public network interface

The nova_network_*_iface parameters default to em1 and em2. These may work on the machines in your cluster, and you may not have to change them. Since the test cluster is on older hardware, eth0, eth1, and eth2 are where the networks sit, so for this test cluster the parameters are set to those device names. The test cluster needs the eth2 interface for the public network because it is using foreman-0-0 as a compute node. If your Foreman node is not part of your test cluster, you may not need to change this.

More advanced networking configurations, i.e. when using multiple networks or using Neutron, may require additional parameters.

Click “Submit.” Any host that is listed with the “Compute (Nova Network)” role will inherit these parameters.

Now let’s add the hosts that will belong to the “Compute (Nova Network)” host group.

Go to the “Hosts” tab once again, and choose all the hosts that will run as Nova Network Computes. In this example, since it’s such a small cluster, we’ll add the “foreman-0-0” machine as an OpenStack Compute:

Now click on “Select Action” and choose “Change Group.”

 

Click on “Select Host Group” and choose “Compute (Nova Network)” then click “Submit.”

 

The hosts should show the group they’ve been assigned to:

 

Again, you can wait for the Puppet run or spawn it yourself from StackIQ Cluster Manager. Since we have a group of machines, we will use “rocks run host” to spawn “puppet agent -tv” on all the machines:
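A sketch of that command as we ran it on this test cluster (with no host list given, “rocks run host” executes the command on every node):

# rocks run host "puppet agent -tv"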

 
If we had chosen only the Compute nodes for OpenStack Compute role and not the Foreman node, we could do this on just the computes by specifying their appliance type:
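A sketch of the same command restricted by appliance type (using the appliance name as the host selector, the same way the “rocks set appliance attr” commands above do; verify this works on your cluster):

# rocks run host compute "puppet agent -tv"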

Once puppet has finished, log into the OpenStack Controller Dashboard to start using OpenStack.

Using OpenStack

To access the controller node, go to http://<controller node ip>. This is accessible on either the public IP you configured for this machine or the private IP. If you have only configured the private IP, you’ll have to open a browser from StackIQ Cluster Manager or SSH port forward to the private IP from your desktop.

The username is “admin” and the password was randomly generated during the Controller puppet install. To get this password, go to the Foreman web GUI, click on the “Hosts” tab and click on the host name of the Controller host: 

 

Then click “Edit” and go to the “Parameters” tab:

 

Copy the “admin_password” string:

 

 

Paste it into the password field on the OpenStack Dashboard and click “Submit.”

 

 

You should now be logged into the OpenStack Dashboard

 

 

Click on “Hypervisors”; you should see the three OpenStack compute nodes you’ve deployed.

 

As a simple example, we’ll deploy the Cirros cloud image that OpenStack uses in their documentation.

Click on “Images.”

 

Click on “Create Image” and you’ll be presented with the image configuration window.

 

Fill in the required information:

Name - we’ll just use “cirros”

Image Source - use default “Image Location”

Image Location - http://download.cirros-cloud.net/0.3.1/cirros-0.3.1-x86_64-disk.img

 

Why do we know this? Because I looked it up here: http://docs.OpenStack.org/image-guide/content/ch_obtaining_images.html

Format - QCOW2

And make it “Public”

Then click “Create Image.”
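If you prefer the command line, the same image can be registered from the controller node with the glance client. This is a hedged sketch rather than part of the original walkthrough: it assumes the glance v1 CLI that ships with RHEL-OSP 4 and that admin credentials (OS_USERNAME, OS_PASSWORD, OS_TENANT_NAME, OS_AUTH_URL) are already exported in your shell:

# glance image-create --name cirros --disk-format qcow2 --container-format bare --is-public True --copy-from http://download.cirros-cloud.net/0.3.1/cirros-0.3.1-x86_64-disk.img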

The image will show a status of “Queued.”

 

And when it’s downloaded and available to create Instances, it will be labeled as “Active.”

 

Cool! Now we can actually launch an instance and access it.

Adding an Instance:

Click on “Project” then on “Instances” in the sidebar:

 

Click on “Launch Instance.”

 

Fill out the parameters:

Availability Zone - nova, default

Instance Name - we’ll call it cirros-1

Flavor - m1.tiny, default

Instance Count - 1, default

Instance Boot Source - Select “Boot from Image”

Image Name - Select “cirros”


Set up security so we can log into the instance. Choose Access and Security and edit the default security group; for this example, we’re just making this a very promiscuous server.
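For reference, the command-line equivalent of that dashboard edit looks roughly like the following (this assumes the nova client of this era and exported admin credentials; the rules open ICMP and SSH to the world, which is only appropriate for a throwaway test cluster):

# nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
# nova secgroup-add-rule default tcp 22 22 0.0.0.0/0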

Click on “Launch”

 
You should see a transient “Success” notification on the OpenStack Dashboard and then the instance should start spawning.
When the instance is ready for use, it will show as “Active” with power state “Running,” and log-in should work. (The Cirros login is “cirros” and the password is “cubswin:)”; the “:)” is part of the password.)

Logging into the Instance

In this simple example, to log into the instance, you must log into the hypervisor where the instance is running. Subsequent blog posts will deal with more transparent access for users.

 

To find out which hypervisor your instance is running on, go to the “Admin” panel from the left sidebar and click on “Instances.”


 

We can see the instance is running on compute-0-1 with a 10.0.0.2 IP. So from a terminal on the frontend, ssh into the hypervisor compute-0-1.


 

Now log into the instance as user “cirros” with password “cubswin:)” (the “:)” is part of the password).
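Putting the two hops together from a terminal on StackIQ Cluster Manager, using the example hostname and instance IP above (your hypervisor name and 10.0.0.x address will likely differ):

# ssh compute-0-1
# ssh cirros@10.0.0.2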

 

 


 

Now you can run Linux commands to prove to yourself you have a functioning instance:


 

Reinstalling

There are times when a machine needs to be reinstalled: hardware changes or repair, uncertainty about a machine’s state, etc. A reinstall generally takes care of these issues. The goal of StackIQ Cluster Manager is to have software homogeneity across heterogeneous hardware. StackIQ Cluster Manager allows you to have immediate consistency of your software stacks on first boot. One of the ways we do this is by making reinstallation of your hardware as fast as possible (reinstalling 1000 nodes is about as fast as reinstalling 10) and ensuring the machine is correct when it comes back up.

One of the difficulties with the OpenStack puppet deployment is certificate management. When a machine is first installed and communicates with Foreman, a persistent puppet certificate is created. When a machine is re-installed or replaced, the key needs to be removed in order for the machine to resume its membership in the cluster. StackIQ Cluster Manager takes care of this by watching for reinstallation events and communicating with the Foreman server to remove the puppet certificate. When the machine finishes installing, the node will rejoin the cluster automatically. In the instance of a reinstall, if the OpenStack role has been set for this machine, the node will do the appropriate puppet run and rejoin OpenStack in the assigned role, and you really don’t have to do anything special for that to happen.

To reinstall a machine:

# rocks run host <hostname> "/boot/kickstart/cluster-kickstart-pxe"

or

# rocks set host boot <hostname> action=install

# rocks run host <hostname> "reboot"

If you wish to start with a completely clean machine and don’t care about the data on it, set the “nukedisks” flag to true before doing one of the above installation commands:

# rocks set host attr <hostname> nukedisks true

Multi-use clusters
StackIQ Cluster Manager has been used to run multi-use clusters with different software stacks assigned to different sets of machines. This OpenStack implementation is no exception. If you want to allocate machines for another application and you’re using the RHEL OpenStack Bridge roll, you can turn off OpenStack deployment on certain machines, and they will not be set up to participate in the OpenStack environment. To do this, simply do the following:

# rocks set host attr <hostname> has_openstack false

The bridge roll sets every compute node to participate in the OpenStack distribution. Throwing this flag for a host means the machine will not participate in the OpenStack deployment. If the machine was first installed with OpenStack, then you will have to reinstall after setting this attribute.

Updates

Red Hat provides updates to RHEL-OSP and to RHEL Server regularly. StackIQ tracks these updates and will provide updated rolls for critical patches or service updates to RHEL-OSP. Additionally, if your frontend is properly subscribed to RHN or to Subscription Manager, these updates can easily be pulled and applied with the "rocks create mirror" command. Updating deserves a blog post of its own, which will be forthcoming.
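As a rough illustration only (the exact arguments depend on your subscription setup and Rocks version, so check "rocks create mirror help" before running anything), mirroring an update repository into a roll looks something like:

# rocks create mirror http://<update repository URL> rollname=RHEL-Updates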

Next Steps

Admittedly, we are documenting the simplest use case - Nova Networking on a single network. This is not ideal for production systems, but by now, you should be able to see how you can use the different components, StackIQ Cluster Manager, Foreman, and OpenStack Dashboard to easily configure and deploy OpenStack. Adding complexity can be done as you explore the RHEL OpenStack ecosystem to fit your company’s needs.

In the future, we will provide further documentation on deploying Neutron, Swift, and Cinder. Additionally, layering OpenStack roles (Swift and Compute for instance) will be topics we will be exploring and blogging about as we move forward with Red Hat’s OpenStack Platform. Stay tuned!

Resources

StackIQ:
Using StackIQ Cluster Manager for deploying clusters: https://s3.amazonaws.com/stackiq-release/stack3/roll-cluster-core-usersguide.pdf. Video: https://www.youtube.com/watch?v=gVPZcA-yHQY&list=UUgg-AnfqnNCp-DxpVEfJkuA

Red Hat:
RHEL OpenStack Documentation: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/

We’re happy to answer questions on installing, configuring, and deploying RHEL-OSP with StackIQ Cluster Manager. Please send email to support@stackiq.com.
Greg Bruno, Ph.D., VP of Engineering, StackIQ
@StackIQ

Update: StackIQ Cluster Manager Now Integrated With Cloudera

Posted by Greg Bruno on Apr 8, 2014 4:00:00 PM

Updated: 4/8/2014 (Note that these instructions are for Cloudera Enterprise 4. To use StackIQ Cluster Manager with Cloudera Enterprise 5, please contact support@stackiq.com)

StackIQ takes a “software defined infrastructure” approach to provision and manage cluster infrastructure that sits below Big Data applications like Hadoop. In this post, we’ll discuss how this is done, followed by a step-by-step guide to installing Cloudera Manager on top of StackIQ’s management system.

Components:

The hardware used for this deployment was a small cluster: 1 node (i.e. 1 server) is used for the StackIQ Cluster Manager and 4 nodes are used as backend/data nodes. Each node has 2 disks and all nodes are connected together via 1Gb Ethernet on a private network. The StackIQ Cluster Manager node is also connected to a public network using its second NIC. StackIQ Cluster Manager has been used in similar deployments between 2 nodes and 4,000+ nodes.


Step 1: Install StackIQ Cluster Manager

The StackIQ Cluster Manager node is installed from bare metal (i.e. there is no prerequisite software and no operating system previously installed) by burning the StackIQ Cluster Core Roll ISO to DVD and booting from it (the StackIQ Cluster Core Roll can be downloaded from the Rolls section after registering). The Core Roll leads the user through a few simple forms (e.g., what is the IP address of the Cluster Manager, what is the gateway, DNS server) and then asks for a base OS DVD (for example, Red Hat Enterprise Linux 6.5; other Red Hat-like distributions such as CentOS are supported as well). The installer copies all the bits from both DVDs and automatically creates a new Red Hat distribution by blending the packages from both DVDs together.

The remainder of the Cluster Manager installation requires no further manual steps and this entire step takes between 30 to 40 minutes.

 

Step 2: Install the CDH Bridge Roll

StackIQ has developed software that “bridges” our core infrastructure management solution to Cloudera’s Hadoop distribution that we’ve named the CDH Bridge Roll. One feature of our management solution is that it records several parameters about each backend node (e.g., number of CPUs, networking configuration, disk partitions) in a local database. After StackIQ Cluster Manager is installed and booted, it is time to download and install the CDH Bridge Roll:

  • Log into the frontend as "root", download cdh-bridge ISO from here.

  • Then execute the following commands at the root prompt:

 # rocks add roll <path_to_iso>
 # rocks enable roll cdh-bridge
 # rocks create distro
 # rocks run roll cdh-bridge | sh

The cluster is now configured to install Cloudera packages on all nodes.

 

Step 3: Install Cloudera Manager and Cloudera CDH4 Roll

You can download a prepackaged Cloudera Manager here and a prepackaged Cloudera CDH4 from here.

We will now install these 2 ISOs. 

 rocks add roll cloudera-cdh4/cloudera-cdh4-6.5-0.x86_64.disk1.iso
 rocks add roll cloudera-manager/cloudera-manager-6.5-0.x86_64.disk1.iso
 rocks enable roll cloudera-cdh4
 rocks enable roll cloudera-manager
 rocks create distro
 rocks run roll cloudera-cdh4 | sh
 rocks run roll cloudera-manager | sh

 

Step 4: Install the backend nodes

Before we install the backend nodes (also known as compute nodes), we want to ensure that all disks in the backend nodes are optimally configured for HDFS. During an installation of a data node, our software interacts with the disk controller to optimally configure it based on the node’s intended role. For data nodes, the disk controller will be configured in “JBOD mode” with each disk configured as a RAID 0, a single partition will be placed on each data disk and a single file system will be created on that partition. For example, if a data node has one boot disk and 4 data disks, after the node installs and boots, you’ll see the following 4 file systems on the data disks: /hadoop01, /hadoop02, /hadoop03 and /hadoop04.

For more information on this feature, see our blog post Why Automation is the Secret Ingredient for Big Data Clusters.

Now we don’t want to reconfigure the controller and reformat disks on every installation, so we need to instruct the StackIQ Cluster Manager to perform this task the next time the backend nodes install. We do this by setting an attribute (“nukedisks”) with the rocks command line:

# rocks set appliance attr compute nukedisks true
# rocks set appliance attr cdh-manager nukedisks true

Now we are ready to install the backend nodes. First, put the StackIQ Cluster Manager into "discovery" mode using the CLI or GUI and PXE boot all backend nodes. We will boot the first node as a cdh-manager appliance. The cdh-manager node will run the Cloudera Manager web admin console used to configure, monitor, and manage CDH.
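On the command line, this looks much like the Foreman appliance discovery in the OpenStack walkthrough above; a sketch (the exact appliance label presented by insert-ethers may differ):

# rocks set attr discover_start true
# insert-ethers

Choose the cdh-manager appliance in the insert-ethers menu, then PXE boot the node.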

After installation, the cdh-manager node shows up in the discovery list.

We will install all the other nodes in the cluster as compute nodes. StackIQ Cluster Manager discovers and installs each backend node in parallel (10 to 20 minutes) - no manual steps are required.


For more information on installing and using the StackIQ Cluster Manager (a.k.a., Rocks+), please visit StackIQ Support or watch the demo video.

After all the nodes in the cluster are up and running you will be ready to install Cloudera Manager. In this example, the StackIQ Cluster Manager node was named “frontend” and the compute nodes were assigned default names of compute-0-0, compute-0-1, compute-0-2 (3 nodes in Rack 0), and compute-1-0 (1 node in Rack 1).
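Before installing Cloudera Manager, it is worth spot-checking that the per-disk file systems described in Step 4 exist on a data node. A small sketch using the "rocks run host" pattern from the OpenStack walkthrough above (compute-0-0 is one of our example data nodes):

# rocks run host compute-0-0 "df -h | grep hadoop"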

 

Step 5: Install Cloudera Manager 

SSH into cdh-manager appliance, as root, execute:

# /opt/rocks/sbin/cloudera-manager-installer.bin --skip_repo_package=1

This will install Cloudera Manager with packages from our local yum repository as opposed to fetching packages over the internet.

Step 6: Select What to Install 

Log into the cdh-manager node at http://<cdh-manager>:7180 (where ‘<cdh-manager>’ is the FQDN of your cdh-manager node) with username admin and password admin.


Choose Cloudera Enterprise trial if you want to do a trial run


The GUI will now prompt you to restart the Cloudera Manager server. Run the following command on the cdh-manager node:

# service cloudera-scm-server restart

After restarting the server, you will be asked to log in again. Click Continue on the next screen.


Specify the list of hosts for CDH installation, e.g., compute-0-[0-3],cdh-manager-0-0.


After all the hosts are identified, hit Continue


Choose Use Packages and select CDH4 as the version in the screen below.


Specify a custom repository for the CDH release you want to install. Use http://<frontend>/install/distributions/rocks-dist/x86_64/ as the URL of the repository, where <frontend> is the IP address of the cluster’s frontend.
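Before continuing, you can sanity-check that the frontend is actually serving that repository path (an optional check; substitute your frontend’s IP for <frontend>):

# curl -s http://<frontend>/install/distributions/rocks-dist/x86_64/ | head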

In our example, 10.1.1.1 was the IP address of the private eth0 interface on the frontend, so the repository URL was http://10.1.1.1/install/distributions/rocks-dist/x86_64/.

Choose All hosts accept same private key as the authentication method. Use Browse to upload the private key present in /root/.ssh/id_rsa on StackIQ Cluster Manager.


You will then see a screen where the progress of the installation will be indicated. After installation completes successfully, hit Continue.


You will then be directed to the following screen where all hosts will be inspected for correctness.


Choose a combination of services you want to install and hit Continue


Review that all services were successfully installed.


Finally your Hadoop services will be started.


Step 7: Run a Hadoop sample program

It is never enough to set up a cluster and the applications users need and then let them have at it. There are generally nasty surprises for both parties when this happens. A validation check is a requirement to make sure everything is working the way it is expected to.

Do this to test whether the cluster is functional:

  • Log into the frontend as “root” via SSH or PuTTY.

  • On the command line, run the following map-reduce program as the “hdfs” user, which runs a simulation to estimate the value of pi based on sampling:

# sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 10 10000

The job should complete and print an estimated value of Pi.

Congratulations, you are done!

We’re certain you’ll find this the quickest way to deploy a cluster capable of running Cloudera Hadoop. Give it a shot and send us your questions!

The StackIQ Team

@StackIQ


Topics: hadoop cluster, hadoop, hadoop management, hadoop startup, big data, cloudera

Web-Scale-IT… What the What?

Posted by Matthias Ankli on Feb 17, 2014 9:38:00 AM

With a constantly increasing amount of data and more complex application requirements, talk about so-called Web-scale IT architecture is on the rise. But what exactly does Web-scale IT mean?

The research firm Gartner introduced the term Web-scale IT in an effort to describe what the fine folks at Internet giants like Facebook, Google, LinkedIn, etc. have achieved in agility and scalability by applying new processes, architectures, and practices. These companies go beyond “scale in terms of sheer size to also include scale as it pertains to speed and agility,” according to Gartner. The research firm also named it one of the top ten strategic technology trends of 2014.

The term Web-scale IT is often used in the context of DevOps, but it also applies to the underlying IT infrastructure – the system needs to be in a “known good state” to achieve agility at scale. And by now you probably know where we are going with this (hint: we pride ourselves on being the leaders in infrastructure automation).

Can Organizations of All Sizes Benefit From Web-Scale IT Methodology?

Good question; let’s think about this. While most organizations don’t reach the scale of a Google or a Facebook, they will still benefit from the increased velocity that comes with the Web-scale IT approach (if done right). But let’s go even further: thanks to the availability of powerful open source tools like Hadoop, OpenStack, etc., Big Data and cloud techniques are no longer the privilege of hyperscale web properties and have become available to enterprises of all sizes. That being said, with all these new tools and capabilities, many sub-web-scale enterprises today run some form of Big Infrastructure, and that brings new challenges.

Disruption of the IT Infrastructure As We Know It

The shift to Web-scale IT represents a radical departure from the old ways of doing things in the IT world and as with every disruptive movement, it can be a scary transition. Web-scale IT requires IT professionals to be able to move faster than ever to deploy and manage Big Infrastructure. Infrastructure has become increasingly heterogeneous with commodity hardware, open source software, and home-grown provisioning and management software that make infrastructure difficult to manage at scale. Many steps are still done manually, are inefficient, and error prone.

To do Web-scale IT right, organizations must move to the next level of infrastructure automation, the level that understands the requirements of applications and responds to those requirements in real time – a software defined environment. In a software-defined environment, IT becomes simplified, as well as responsive to shifting requirements and adaptive through automation. Building and managing these systems is the “secret sauce” but it isn’t easy to achieve.

Hyperscale websites have the capital to build their own management tools that automate the management, configuration, and deployment, but that takes resources that enterprise IT infrastructure managers just don’t have. However, they still need automation if they want to achieve their goals. They need a solution that will harness the collection of commodity hardware and open source software, and make it work like a turnkey solution (from bare metal all the way to the applications layer) without the price tag of a proprietary system.

What Would Such a Solution Look Like?

Here at StackIQ we spend a lot of time thinking about how to make the lives of infrastructure managers easier. How can we help enterprises of all sizes to get to the next level of infrastructure automation and benefit from a Web-scale IT approach?

What do you think? To get the discussion started, here are a few characteristics that we believe are key to be successful in this new IT world:

- Heterogeneous support for all commodity hardware and open source software

- Capability to build the entire stack from bare metal to the applications layer

- Modular extensibility of the stack to keep up with ever-changing business requirements

- Simplified deployment with script-free configuration

Please chime in! We are looking forward to your input.

The StackIQ Team (@stackiq)

 

Curious to learn more about Web-scale IT? Check out Gartner’s Cameron Haight’s blog. We also highly recommend the research paper “The Long-Term Impact of Web-Scale IT Will Be Dramatic.” (note that this is a paid Gartner publication).


Topics: hadoop, big data, Cloud, cluster management, business, automation, DevOps, architecture, bare metal, big infrastucture, Web-scale IT, 2014

Get Salted at SaltConf 2014

Posted by Matthias Ankli on Jan 22, 2014 12:58:00 PM

It’s the time of the year to go to beautiful Salt Lake City, UT. Snow-covered mountains in a winter wonderland that offers lots of activities on and off the slopes for all you snow daredevils. But that’s not all my friends; it’s also time for the annual SaltStack conference. It’s THE global user conference for SaltStack customers, partners, developers, and community members. Here at StackIQ, we have always been great supporters of the SaltStack community and are excited to announce that Mason Katz, our CTO, will give a talk alongside many other wickedly smart people. Mason’s session will focus on how to augment bare metal clusters with SaltStack. This sounds great, you may say, but what are the use cases for SaltStack combined with StackIQ’s cluster management software? This question and more will be answered during Mason’s talk, but here is a little taste of what to expect.

StackIQ Cluster Manager, our comprehensive software suite, automates the deployment and management of Big Data, Cloud, Linux and HPC clusters. We take care of the entire software and hardware stack from bare metal all the way to the applications layer. Recently we have expanded on our kickstart installation method to include SaltStack for two use cases:  First, we use SaltStack as a replacement of user management systems such as NIS or LDAP. Second, we use SaltStack to dynamically manage middleware configuration files (e.g., Hadoop). Unique to our use of SaltStack is the integration of our cluster configuration database with SaltStack grains and state files. This talk is focused on our specific use of SaltStack and a walk through our design decisions. We’ll introduce a novel use case where SaltStack is a consumer of configuration information in addition to a manager. Last but not least, we want to hear from the community on how to make StackIQ Cluster Manager and SaltStack work even better together.

Did we get your attention? There is no better way to get salted than at SaltConf, with hands-on labs and training, talks by your peers, SaltStack engineers and developers, big keynotes, lots of hacking and networking, and the BEST (so we were told) snow on earth.

Do you have any questions for the StackIQ crew before or during the conference? Just DM @masonkatz or @stackiq. Be there and stay warm!

The StackIQ Team

 

SaltConf 2014
January 28-30 @ Marriott City Center in Salt Lake City, UT
Augmenting Bare Metal Cluster Builds with SaltStack – Mason Katz, CTO, StackIQ (@masonkatz) 

Photo/artwork credits:
- Greetings from Salt Lake City,
Allposters


Topics: cluster, bare metal, big infrastucture, saltstack, saltconf

Performance scaling on a big data cluster with Dell and StackIQ

Posted by Matthias Ankli on Aug 29, 2013 12:51:00 PM

Firsthand experience is an important part of the decision-making process for IT professionals who are exploring cloud computing and big data solutions. In response, a large financial institution worked in collaboration with Dell and StackIQ on a proof-of-concept that compared application performance on a big data cluster. Using our StackIQ Cluster Manager software, the team was able to rapidly configure the servers – leading to more, higher-quality tests than anticipated.

Read the full article “Evaluating Performance Scaling on a Big Data Cluster” co-authored by our very own Tim McIntire and Greg Bruno in Dell Power Solutions magazine, 2013 issue 3. Or if you happen to be at a Dell office, pick up a print copy in the lobby.

The StackIQ Team

 

 


Topics: big data, cluster, Cloud, cluster management, case study

On-Premise Manufacturing

Posted by Greg Bruno on Jun 19, 2013 1:29:00 PM

Recently, we were in a partner meeting and the topic of "brownfield deployment" came up. Just to make sure we have our terms defined, a brownfield deployment is where a product is deployed onto an existing cluster, that is, the existing cluster already has a base OS installed and configured across all its nodes (a "ping and a prompt"). We were discussing this topic because our product excels at "greenfield deployments", that is, our product is a bare-metal installer that deploys, configures, optimizes and manages the entire software stack -- the base OS, middleware services and the application(s). In this discussion, the partner said, "but that doesn't fit into our customers’ current enterprise processes" of installing a base OS first, then handing servers off to another team for application deployment and management.

Actually, it does, if you look in the right place.

Today, there are several computer systems that don't fit into the “current enterprise process” of provisioning servers as a stand-alone process: Oracle and Teradata, to name two. These systems are constructed off-site at a manufacturing facility -- the hardware is assembled, then the entire software stack is installed, configured and optimized. Then the fully-assembled system is shipped to the enterprise customer as an appliance that is then integrated into the current set of on-premise computer systems.

As we've all read, we are at the beginning of a revolution surrounding database systems -- proprietary systems are being replaced with commodity "do it yourself" systems. We saw this same revolution in the high-performance computing space in the early 2000s. Proprietary systems created by Cray and IBM were replaced by commodity x86 clusters. This was fueled by significant hardware cost savings, but now the end user was faced with the heavy burden of deploying, configuring, optimizing and managing the entire software stack -- the job that Cray and IBM performed for their turnkey systems. That's why two other parallel systems developers and I created the Rocks Cluster Distribution to "make clusters easy" by automating what was once a heavy burden (the Rocks Cluster Distribution is the core of StackIQ's enterprise software line).

Fast forward to today, StackIQ is making clusters easy in the enterprise. We are seeing Oracle and Teradata systems being replaced by commodity clusters. And we are seeing enterprises struggle with the heavy burden of managing cluster-aware middleware and applications, just like we saw the high-performance computing community struggle 10 years ago.

The enterprise values robust software systems, which is why Oracle and Teradata were so successful in the past. Oracle's and Teradata's manufacturing process ensured robustness and stability for every system they produced. This is still of critical importance for enterprise applications, thus StackIQ has taken cues from Oracle and Teradata (I worked at Teradata prior to co-founding the Rocks project) and built an enterprise software management product that ensures the same robustness and stability for the entire software stack that enterprise users have come to expect; we do this via on-premise manufacturing. Our software transforms on-site commodity clusters into enterprise appliances.

Once commodity clusters are managed as appliances, these systems fit into every “current enterprise process,” your racks of Oracle or Teradata systems are proof.

Greg Bruno, PhD

@itsdrbruno



Why Automation is the Secret Ingredient for Big Data Clusters

Posted by Greg Bruno on Apr 30, 2013 1:38:00 PM

The word “automatic” has been around since the 1500s, but really came to the fore in 1939. That’s when the New York World’s Fair sparked everyone’s imagination with visions of technology that promised to solve all of our problems through automation. Recently, while working with one of our customers, I was reminded how automation can still surprise people. Let me tell you what I mean.

Prove It

A large credit card company recently asked us to participate in a “proof-of-concept” for their big data project. As a startup, we are always thrilled when one of the big boys wants to try out our wares, so we jumped at the opportunity.

When we arrived on site in their data center, they assigned a half-dozen machines for us to use. One would become the StackIQ Cluster Manager, and the other 5 would become cluster nodes running Hadoop. We are used to building clusters of all sizes using our software, and knew that a small, straightforward installation like this one would be a cake walk. We set about our task.

We set up a few parameters for the cluster, and launched the StackIQ Cluster Manager. It was soon up and running without a hitch, as expected.

Next, we used the Cluster Manager to install the cluster machines. Twenty minutes later, all 5 backend machines were up and running Hadoop services. Smooth. No problem. Expected.

It’s A Trap!

That’s when my colleague and I noticed that the customer’s IT people were whispering to each other, and we started to wonder if we’d done something wrong. We checked our screens, and found that the cluster was indeed up and running — ready to accept Map/Reduce jobs.

So we took a deep breath and walked over to the gathered whisperers and asked if there was a problem. One of them asked in a hushed voice, “Um, how’d you guys do that?”

“Do what?” we answered.

“Bring up that one machine?” he said, pointing at one of the cluster servers.

After we explained that we hadn’t done anything special, we just let our Cluster Manager do its thing, the customer confessed, "We’ve been struggling to configure that machine for over 2 weeks now and haven’t been able to get it to install. There seemed to be something wrong with the configuration of the disk controller, but we haven’t been able to fix it.”

We smiled.

That’s the power of true automation. That’s what we designed our software to do. That’s what makes us very proud of the software we build. It takes the headaches out of setting up clustered infrastructure of any size by automating nearly everything — including configuring those pesky disk controllers.

What was a major problem for our customer — one they hadn’t been able to solve in weeks — wasn’t even a bump in the road for our cluster manager. It found the controller, configured it, and moved on to its next task. Smooth. No problem. Expected.

Less Tedium

It can take as many as 80 manual steps to correctly configure a disk controller for use in a Big Data cluster, and clusters have a lot of disks — and controllers. We knew that we had to automate the configuration of all those disks to help cluster operators build their clusters efficiently. Automating the procedure dramatically reduces the time it takes to put a cluster into production.

Here’s how we do it. On first installation of a server, our software interacts with the disk controller to optimally configure it based on the node’s intended role. For example, if the machine is a data node, the disk controller will be configured in “JBOD mode” with each disk configured as a RAID 0. However, if the machine is going to be a Cassandra data node, the data disks will be automatically configured as a RAID 10. This all happens automatically — no manual steps — ensuring that all cluster nodes are optimally configured from the start.

The goal is a smooth configuration process. It’s just a bonus when we get to surprise and delight a customer who sees their cluster up and running after struggling for weeks on their own trying to solve a stubborn configuration problem.

Smooth. No problem. Expected.

 

Greg Bruno

@itsdrbruno


Topics: big data, cluster management, automation

Bare Naked Servers for Big Data!

Posted by Lionel Gibbons on Apr 11, 2013 9:43:00 AM


Have you ever gotten so immersed in a topic that you forgot that others might not be? For instance, you may have lapsed into jargon from your workplace while at a party and been met with that look that says, “Umm, I think I’ll go refresh my cocktail now.” I know I have. It turns out people have better things to do with their time than study whatever particular topic you think about all day long.

The same thing can happen when your company communicates with people. I don’t know what business you’re in, but I’ll bet the way you and your colleagues talk about it would baffle the uninitiated. I was recently reminded of the problems insider-speak can create as we were gearing up to start a new proof-of-concept project with a prospective customer.

Here’s what happened.

At StackIQ, we make software that builds clusters for big data from bare metal. By “bare metal” we mean machines that have no software on them at all. We use that term in our presentations, sales pitches, web site, and marketing collateral.

The reason our software provisions systems from bare metal stems from the philosophy our founders developed during their years building and maintaining clusters. They discovered that if you allow operators to apply patches and change configuration settings incrementally to various machines in the cluster, you eventually wind up with a system in an unknown state. That makes it very difficult to troubleshoot problems. Which machine is running which version of the OS? Which ones are at the current patch level? Which have yet to be updated? Were all the change logs updated — every time? Who knows?

The only way to know for sure what is running on all of the machines in your cluster, is to install each of them from scratch (aka bare metal) using a known-good source. So we developed a system that does just that, and does it fast.

OK, back to our confused customer. We had given them our sales pitch, and they agreed to try out our software in their labs. When it came time to allocate some servers to the test, they asked us which operating system we wanted them to install. We explained (again) that it didn’t matter, since our software would install everything “from bare metal.” To which they responded, “Oh, OK. So we’ll leave the cluster nodes empty, and just install Linux on the management node.” “No need,” we explained, “we will install the entire cluster from bare metal — including the management node. There’s no need for you to install any software at all.”

Anyway, we got it all straightened out, and the customer gave us a set of bare machines to run our tests on.

Why was our customer confused? It wasn’t their fault. What we do is decidedly different from what others in our space do. Our competitors require that an OS and other software be in place before they begin their installation. They don’t operate from our “clean slate” philosophy. What’s more, the term “bare metal” is often used to mean something different in the IT community. For example in the cloud computing space, “bare metal” is used to describe a software stack that is running directly on the hardware, and not in a virtual machine. Even wikipedia redirects a search for bare metal to an article on “bare machine.”

I took this incident as a reminder that we should never assume what others know. Everyone’s experience is different, and that experience gives them a unique perspective. So whether you’re a marketing professional, a sales professional, or a technologist, it’s always a good idea to check that people have understood your message, and adjust your language to make yourself clear.

Hmmmm, maybe I should go run a find/replace operation on our product information to replace “bare metal” with “bare machine”…

photo credit: JD Hancock via photopin cc


Topics: hadoop cluster, big data, cluster management

GigaOM Interviews StackIQ Executive about Big Data

Posted by Lionel Gibbons on Apr 9, 2013 9:25:00 AM

GigaOM caught up with StackIQ executive, Tom Melzl, during the Structure Data conference to get an update on the company. In the interview, Tom explains why cluster management is crucial to any successful big data project, and what differentiates StackIQ from its competitors. He also gives us a peek at the technology areas the company is focused on as they develop innovations for the future. 

 

 GigaOM talks to StackIQ's Tom Melzl  (3:50)

 

 


Topics: hadoop, big data, interview, tom melzl

Big Data Can Mean Big Money for Retailers

Posted by Lionel Gibbons on Mar 12, 2013 11:51:00 AM


Have you heard this story? A couple of MBA students were scoping out the local 24-hour convenience store and noticed an end cap that featured an odd pairing of products: diapers and beer. Huh? Turns out that someone crunched their customer behavior data deeply enough to figure out that when a bleary-eyed new father stumbles into the store late at night, diapers or beer were probably what he was after. By displaying these prominently on the front end of the aisles, the store was able to make the late-night shopper’s quarry easy to find.

If beer goes with diapers the way cookies go with milk, imagine what insights big data could bring to your business. Retail is right in the sweet spot to benefit most from big data projects. Some large retail organizations generate terabytes of data every minute. Inventory systems, loyalty cards, and sales transactions reveal exactly what was sold, when it was sold, and what other items were rung up in the same purchase.

So Much Data, So Many Ways to Use it

What’s happening to that data now? Much of it gets stored, and later used for financial analyses of various sorts. Increasingly, other departments are starting to dip into the data for their own purposes.

Human Resources departments are using big data to determine how many sales associates and other personnel to have on hand, and when. Hiring and staffing patterns will become more precise, contributing to the bottom line.

Buyers are leveraging the data when working with suppliers. The result? Fewer returns, fewer overstocks, fewer costly mistakes like all those leftover candy canes in the back of the shop months after the holiday season has come and gone.

Shelf placement is usually done by suppliers, but retailers can use the results of their big data analysis to help optimize that placement. Maybe those oversized boxes of laundry detergent ought to be on the middle shelf instead of the bottom. Better data means better sales and both buyers and suppliers will like that.

Big data tools can help your marketing staff do better, faster research. In one store, they’ll put batteries on the end cap closest to the door. At another store, that’s where the bathroom tissue goes. Who’s right? Who’s wrong? What about the beer and diapers? With the right data, you can take the guesswork out of it. And while we’re at it, take a look at which coupons are working best and which ones never move anything. The possibilities for tweaking are nearly endless.

So, What Do You Need to Make it Happen?

Most retailers are choosing open source Apache Hadoop software running on low cost, commodity hardware for their big data projects.

Setting up and operating a big data cluster can be an intimidating proposition for IT departments used to working with more traditional enterprise data center resources such as email, web, and database servers. Big data clusters are different animals. Fortunately, the market has responded by providing good deployment and management tools. With the right tools, any IT department can deploy and manage big data clusters with confidence, even if they’ve never done it before.

Another benefit of working with a good vendor is that they are experts in the art of cluster management. You can draw on their years of experience, building and running clusters of all sizes. Chances are pretty good they’ve already seen and solved any problem you run across.

So, are you ready to take the big data plunge? Start out on the right foot, and pretty soon you’ll move from being a big data beginner to petabyte-crunching pioneer.

photo credit: x-ray delta one via photopin cc

Topics: big data, business, Retail
