How many of you have a backup horror story to tell? Remember when Pixar almost lost 6 months of work on Toy Story 2?
The world is full of backup horror stories. With the amount of data we are generating, coupled with rules and regulations on retention, professionals tasked with backing up data have a constant battle on their hands. They need to be able to back up more data faster and, more importantly, they need to be confident they can restore that data rapidly in an emergency.
The good news is that not all data is the same and you can delineate backup policies based on the importance of the data. The bad news is you likely do not have a good way of understanding where that data is and whether it is protected appropriately. You may have multiple copies of some files spread across multiple domains on your network. You may have unimportant files that have not been accessed in over 10 years consuming valuable space on backup devices, when those files should have been archived or deleted. With the mushrooming data sets you are dealing with, how can you keep all of this under control and improve your backup?
The answer has to lie in a data catalog that can be aligned with data management, backup and archiving policy. A good data catalog should:
Give you visibility into all of your data.
Provide you with actionable utilization and data efficiency information.
Help secure your data by enabling the correct access controls to be implemented and monitored across your datasets and users.
Help ensure compliance by enabling policies to be set for data retention and destruction.
Improve backup efficiency and enable valuable IT resources to be redirected to projects that drive the business forward.
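To make the idea concrete, here is a minimal sketch of what the first step of a data catalog might look like: walk a directory tree, record basic metadata for each file, then flag duplicate copies and files that have not been accessed for a long time. The function names, the choice of SHA-256 for spotting duplicates, and the use of last-access time as the staleness signal are all assumptions made for illustration, not a description of any particular product.

```python
import hashlib
import os
import time
from collections import defaultdict


def _hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_catalog(root):
    """Walk a directory tree and record path, size, access time and hash."""
    catalog = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Stat before hashing so reading the file cannot
                # change the access time we record.
                info = os.stat(path)
                digest = _hash_file(path)
            except OSError:
                continue  # skip files we cannot read
            catalog.append({
                "path": path,
                "size": info.st_size,
                "atime": info.st_atime,
                "sha256": digest,
            })
    return catalog


def find_duplicates(catalog):
    """Group paths whose contents hash to the same value."""
    groups = defaultdict(list)
    for entry in catalog:
        groups[entry["sha256"]].append(entry["path"])
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def find_stale(catalog, max_age_days, now=None):
    """Return paths not accessed within the last max_age_days days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return [entry["path"] for entry in catalog if entry["atime"] < cutoff]
```

Calling `find_stale(build_catalog("/some/share"), max_age_days=3650)` would list candidates for archiving or deletion, where `/some/share` is a placeholder path. Note that many file systems are mounted with access-time updates reduced or disabled, so a real catalog would likely need other signals as well.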
A data catalog just makes sense. Most IT professionals have some kind of scientific background, and catalogs are how we make sense of the world in science. The Periodic Table is a simple catalog of the elements and their relative properties. In biology we have a taxonomy that classifies millions of species. With this structure everything gets simpler to understand and manipulate. The same is true for data.
I have recently been writing a series of blogs on cataloging and managing the data deluge for Catalogic Software. My latest is on the "Top 5 reasons for building a data catalog."
Backup and archiving are constant improvement projects for all IT organizations. If you are looking at the next steps in how to improve your backup I encourage you to look at building an Enterprise Data Catalog. Maybe breaking your current backup can be a good thing.
Technology is changing every aspect of our lives, and one area rich in technology innovation is healthcare and the battle against our most feared diseases like cancer.
I was very interested in a few groundbreaking research developments on cancer that I have seen in the last few weeks. The first was under the attention-grabbing headline "Girl whose cancer was cured by HIV." What! AIDS vs. cancer. That sounds a bit like Predator vs. Alien. In fact it is an amazing story about how HIV may be modified to attack cancer cells.
The second was around epigenetics and developing personalized cancer treatment. This involves looking at the specific DNA of a patient and determining the right drug combination for their specific case. The computational power required is one of the key reasons why this is not yet a broad treatment possibility for cancer patients. If ever the term "Big Data" was apt, it is here. Maybe it should be "Big C Data"!
The third story was about using electronic noses to identify cancer in its early stages. Early detection is one of the best ways of improving the outcome for cancer patients and it has long been thought that some animals can actually smell cancer. Scientists are exploring this to create better early detection mechanisms.
Just 40 years ago cancer was a disease rarely spoken about. It was the Big C. The disease that shall not be named. A lot has changed in 40 years. Now we have a comedy show on Showtime about cancer, "The Big C", and a movie comedy about cancer, "50/50", and hardly anyone does not know someone who is fighting cancer or has passed because of cancer. There is even a game for your cell phone called Triton, where the aim is to thwart cancer and proceeds go to cancer research. I venture to suggest that almost everyone who reads this blog has volunteered for a charity walk for cancer or provided support for a cancer charity in some other way. We even have slogans like "Save the Tatas."
Some of the areas where technology is helping fight cancer are obvious, such as new technology that helps detect and treat cancer, groundbreaking research that leads to new treatments, or sites like WebMD that offer advice that may help patients spot the early warning signs. However, maybe the biggest tool in the arsenal against cancer is social media.
Most people have probably looked at social media sites to understand an illness at some point, and come away wishing they had not. This is because, as with any application of social media, you have to take the good with the bad. There are always horror stories out there on the internet. However, I believe the good far outweighs the bad. Let's look at how social media can impact a patient with cancer. I should mention here that every cancer situation and patient is different, and whatever path they choose is the right one for them.
A cancer diagnosis is a terrible thing to have to hear and in the aftermath of being told you have cancer you have to decide who you are going to tell and how. Close friends and family are probably going to get told face to face or via the phone, but what about everyone else? Here is the first way in which social media helps:
You can use social media to inform all your acquaintances that you have been diagnosed with cancer in a single post. Facebook can be a key tool here. This may seem a little cold to some people, but the alternative is often distressing. It may not be for everybody, but cancer is a life-changing event like getting married or having a baby, and how many people do we know who have chosen to communicate those events broadly on Facebook?
Once you are undergoing treatment for cancer people are going to want to know how they can help and how you are progressing. It is easy to tell some folks but our friends may be half a world away and it is not going to be easy keeping them up to date if you have to do it individually. This is the second way in which social media can help:
Blogs and dedicated social media sites keep family and friends informed. If you are reasonably savvy you can start your own blog (WordPress, TypePad etc.) about your cancer journey and have your friends sign up to receive updates. For those less technically savvy there are simpler alternatives provided by sites such as CaringBridge or Lotsa Helping Hands. CaringBridge connects people during health events. Currently I am following three friends' CaringBridge journals, and it is an excellent way to let anyone who is interested, anywhere in the world, know how you are coping. CaringBridge relies on donations from those using the site to keep the service going. Lotsa Helping Hands offers support for both patient and caregiver and extends into volunteering in local community events.
Friends and family of cancer patients want to help and social media makes that possible in a number of ways.
Fundraising is made easier via donation sites. Sites like JustGiving, GiveForward, YouCaring and GoFundMe enable easy donations toward a charity or directly to a family or person in need. As an example, my niece recently climbed Mt Kilimanjaro for Cancer Research UK and was able to use JustGiving to collect charitable donations from around the world and raise over $10,000. Some of these sites are designed to help get money to recognized charities, but many are more personal as they can be directed to a specific person in need.
Scheduling help is easier. Sites such as Lotsa Helping Hands or more specialized sites like MealTrain also enable you to set up schedules to help patients by delivering meals etc. On MealTrain you simply sign up for a meal time and what you're committing to, and the calendaring ensures that there is no overlap with others helping care for the patient. MealTrain also offers sample menus for different conditions.
Finding people with a similar diagnosis to discuss experiences with is easier. Many cancer patients feel alone with their diagnosis and would like to speak to people who have been through a similar experience. This is an area that can be a bit of a minefield on the internet, but if you find the right sites it can be immeasurably helpful. Whether you just want to chat to someone going through a similar experience or look for information on your specific condition, there are many sites available. Most medical sites offer chat rooms and most medical conditions have dedicated support groups. From a cancer perspective, groups like LiveStrong and Susan G. Komen get a lot of attention, but if you have a rarer cancer with less funding and visibility you can use the specific support group for that cancer. A couple of examples are: ECAN - Esophageal Cancer, PANCAN - Pancreatic Cancer, ThyCa - Thyroid Cancer.
All of these are helpful tools for the cancer patient and show how social media has become one of the greatest help tools of the last 10 years. However, maybe the greatest impact of social media on cancer is visibility. The pace of innovation in cancer treatments is increasing, but so are the costs. As more people see news about cancer and the people affected, and discussion opens up, more funding goes into research. And the more research there is, the more incredible stories of scientific breakthroughs we hear, like those I started this blog with. Ultimately this pace of research will drive the big breakthroughs that help us detect cancer sooner and treat it more effectively, improving the quality of life for all cancer patients and their families.
Finally, more oncologists and cancer researchers should understand the power of social media tools like Twitter to get information and news out to patients. While some do, it is still not a common practice. There is an excellent article about this by Dr. Michael A. Thompson, MD, PhD on The Cancer Network website and a more comprehensive guide from Dr. Thompson to social media and oncology is here.
Social media is at the heart of driving these breakthroughs because it has made cancer a more open disease. As more and more patients, caregivers and medical personnel embrace social media, the pace of breakthroughs in cancer research and treatment will accelerate. As Martha Stewart would say, "That's a good thing!"
Some links on how social media has had an impact in the fight against cancer:
#nomakeupselfie - a viral message on Facebook, with women taking pictures of themselves with no makeup, raised $12M in just 6 days for Cancer Research UK.
As a storage marketer at a vendor, you are always under pressure to determine the category you are in, so you can clearly lay out your value proposition to customers. It also helps you define your competitive set. If possible you want to define a category, because being identified as a first mover and controlling a category definition is gold dust. Well, in storage we are blessed with a plethora of category definitions and lately it seems like everyone wants to be a part of them all.
As I mentioned in a previous blog, I was a participant in a recent interesting CrowdChat on the emerging category of Server SANs. You can find my original post here. During the debate it was clear that there are a lot of interesting takes on the definition of Server SANs, and those bleed into the broad definition of Software Defined Storage and the definitions of Converged Infrastructure, Converged Storage, and vSANs. Wikibon has taken a cut at defining the Server SAN. I am not sure I agree with it all, but to move the discussion along I thought I would expand on their initial take and broaden the discussion. I am as interested as anyone else in a set of definitions that can be understood. As a zoologist I know the power of having a good taxonomy.
Let’s first take a cut at definitions for some of the most used terms in current vendor marketing materials. (These are my takes; you can find others on the web and I reference some at the end of the article.)
Software Defined Storage – A storage system created from software that runs on an industry standard server. To be truly software defined, all of the storage functions and data handling (e.g., RAID, compression, deduplication) need to be part of the software stack, and the platform should be delivered as software and not as an appliance.
vSAN – A highly available storage system that can be built from running storage software as a virtual machine appliance.
Converged Storage – A storage appliance that is built on an industry standard server base. There are really two separate categories within this. 1) Storage that uses an unchanged industry standard server base and standard Ethernet networking, with all storage data handling done in software. 2) Traditional storage platforms that use hardware based RAID controllers or other offload engines for elements of the storage data processing, or use a less accepted network protocol (e.g. serial ATA). There are very few traditional storage platforms left not using x86 processors as the underlying technology. EqualLogic and the NetApp products acquired from LSI come to mind as examples of non-converged storage.
Converged Infrastructure – A standardized infrastructure with a common management framework that crosses server, storage and networking.
There are other technologies in play that can be confused with these, so let me give my simple attempt at definitions in the table below. I do not claim these to be definitive (I have not found a generally accepted set of definitions). I am using this as a way to differentiate technologies and facilitate discussion.
Server SAN - No, this is not a Japanese server. It is a single product that delivers both server and SAN functionality. That is, it can run the apps that you run on your servers today in virtual machines, and it also provides a highly available storage backend that delivers typical storage functionality.
Hybrid Storage – A storage appliance that blends hard disk drives and solid state into a single solution. Note: Most storage solutions can be called hybrid by this definition. “True” hybrid vendors will state that this capability should be designed from the ground up for efficiency and that the movement of data between HDD and SSD should be based on caching. These purpose built hybrid systems often cannot be purchased without SSD.
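As an illustration of what "movement of data based on caching" can mean, here is a toy sketch in which a small SSD tier acts as a least-recently-used read cache in front of an HDD tier. The class name, the block-id interface and the LRU policy are assumptions chosen for brevity; real hybrid arrays use far more sophisticated and usually proprietary algorithms.

```python
from collections import OrderedDict


class HybridStore:
    """Toy model: a dict stands in for the HDD tier, an LRU cache for SSD."""

    def __init__(self, hdd_blocks, ssd_capacity):
        self.hdd = dict(hdd_blocks)    # all data lives on the "HDD"
        self.ssd = OrderedDict()       # hot blocks, oldest first
        self.ssd_capacity = ssd_capacity
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        """Serve a read from SSD if cached, otherwise from HDD and cache it."""
        if block_id in self.ssd:
            self.ssd.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self.ssd[block_id]
        self.misses += 1
        data = self.hdd[block_id]
        self.ssd[block_id] = data
        if len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)    # evict the least recently used block
        return data
```

Repeated reads of the same few blocks are then served from the fast tier, while blocks that are rarely touched stay only on the slow tier.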
You will notice from these definitions that there is certainly overlap.
Figure 1 Overlap of Storage system definitions.
Note: Traditional storage systems not within the above definitions, such as Dell EqualLogic, are not included. Note: Hybrid storage is not included as a category in this map, although it is a popular term. All of the above categories offer hybrid storage solutions.
From the diagram you can see that a vSAN is also a Server SAN and Software Defined Storage. There are many other overlaps in the definitions. Let’s look at some examples of these technologies and break these categories down further:
** vSANs can also credibly be placed in the converged Infrastructure bucket.
Examples are not exhaustive. They are meant to be illustrative.
Now, with examples placed against the definitions, we can see that a number of products fit into multiple categories. This is not ideal when building a taxonomy. In biology, an ant cannot be an arthropod and a chordate, because those are two distinct phyla. In technical marketing we seem unable to even agree what domain some technologies are in. It is the world in which we live. For a vendor, claiming to be “Software Defined Storage” or “Converged Infrastructure”, or placing yourself in any other category, becomes an important tool to help highlight your value and competitive set to a target customer. As we can see, at the moment the proliferation of definitions does not help provide customers with the clarity they want and deserve.
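The taxonomy point can be checked mechanically. In the sketch below each category is a set of products, and a strict taxonomy would require those sets to be pairwise disjoint. The product names are placeholders, not real products, and the memberships are invented purely to mirror the kind of overlap described above (for example, a vSAN that is also a Server SAN and Software Defined Storage).

```python
from itertools import combinations

# Hypothetical memberships; product_a..product_d are placeholders.
categories = {
    "Software Defined Storage": {"product_a", "product_b"},
    "vSAN": {"product_a"},
    "Server SAN": {"product_a", "product_c"},
    "Converged Infrastructure": {"product_c", "product_d"},
}


def overlapping_products(categories):
    """Return {product: sorted categories} for products in more than one category."""
    membership = {}
    for category, products in categories.items():
        for product in products:
            membership.setdefault(product, []).append(category)
    return {p: sorted(c) for p, c in membership.items() if len(c) > 1}


def is_strict_taxonomy(categories):
    """True only if no product appears in two different categories."""
    return all(a.isdisjoint(b) for a, b in combinations(categories.values(), 2))
```

With the placeholder data, `is_strict_taxonomy(categories)` is false and `overlapping_products(categories)` lists `product_a` and `product_c` as the offenders, which is exactly the situation a biologist would not tolerate.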
I don’t claim to have an answer. There may not be one. However a healthy debate on this is needed. I am interested in your thoughts. If we humans can work out a system that applies to millions of species maybe we can find one that applies more simply to technology and in this case storage.
Update (03/06/2014) - VMware has just announced availability of their vSAN and claims it to be a vSAN, software defined storage, and converged infrastructure. All are true. It will be interesting to see if this enhanced push by VMware will accelerate adoption of more vSANs, and how the market plays out between the vSAN vendors who have been around for a while and the VMware implementation, which is built into the hypervisor. Have the current vSAN vendors missed their opportunity now that VMware is fully in the market, or will this announcement float all boats? Time will tell.
I participated in an interesting session on CrowdChat today on the topic of server and storage convergence. The session was hosted by @wikibon and @deepstoragenet. Convergence is a very expansive subject (I coined the term Converged Infrastructure at HP back in 2008), and I will discuss it more in future blog posts.
When things like servers and SANs DIVERGE they create choice. Which vendor will I use for which area? I can pick and choose from best of breed solutions but I have to do integration work. When they CONVERGE they force a choice. "I am going to place my eggs in this one basket". To buy into the forced choice there must be compelling benefit.
Some interesting focal points came out of the conversation.
What is software defined storage? General consensus is that the term has been overused by too many different architectural implementations.
Is the new storage array storage software that runs on x86 servers? In general this ship has sailed. There are very few storage implementations that are not software based (EqualLogic and the LSI Engenio products sold to NetApp come to mind), and increasingly hardware assisted functions like RAID and deduplication are also being built into the software stack.
Will ServerSANs prevail? This is a true combination of server and storage. There are many unanswered questions in this category. One common question was the impact of running the storage stack on the same processors as the server stack. How many fewer VMs can you host and what is the impact of that? A more pressing question might be, how will customers accept this new paradigm? Corporate IT is built around silos, and people and process issues have to be broken down for the technology to find broad acceptance. Customers are still in wait and see mode here. They are risk averse and are not prone to putting all their eggs in that single basket.
How will latency be handled effectively for scale out solutions? Here there was discussion on whether InfiniBand would find broad acceptance in corporate datacenters as a storage interconnect outside of products such as Oracle Exadata.
You can track more of the ServerSAN conversation on Twitter with the hashtag #serversan or find the full content at https://www.crowdchat.net/serversan
On January 27th 2014, Catalogic Software was born. Catalogic was previously SyncsortDB and can now count itself amongst the myriad data protection vendors in the market. Catalogic's unique take is that by using existing equipment (NetApp as the backup server) and by managing your data better, you can both lower costs and improve recovery point and recovery time objectives.
Apart from a unique approach to backup with Catalogic DPX, they also provide Catalogic ECX for cataloging of data. This is an important element in any data reduction and information lifecycle management strategy.
Catalogic was kind enough to ask me to write a guest blog for them and you can find my first entry here.
For an introduction to Catalogic you can also listen to an interview with Catalogic CEO Flavio Santoni talking to Truth in IT.
As 2014 began there was a fairly silent acquisition made by SGI that could prove to be significant in the future. SGI picked up the technology and engineering team of Starboard Storage Systems.
Full disclosure: I used to work for Starboard and have great respect for the technology they developed. What is important now, though, is what SGI will do with the technology. I have no direct knowledge and can only speculate on the direction, but before I do that let's take a look at SGI today to see why the Starboard technology interested them.
SGI was of course Silicon Graphics Inc and was founded in 1982. They pioneered a lot of graphics, clustering and file system technology and produced high performance computer systems. I worked with SGI as a partner back in the late 80's when I was at Compaq Computer and both companies were involved in the Advanced Computing Environment (ACE) initiative, which attempted to define a new standard built around MIPS processors, EISA architecture, Windows NT and a merge of SCO and Digital Unix. The ACE initiative finally collapsed as SGI acquired MIPS, DEC released Alpha and Intel substantially improved the performance of their processors.
Silicon Graphics eventually went under due to competition from industry standard computing systems and its assets were acquired by Rackable Systems who then used the SGI name.
Rackable brought servers to the game and SGI brought engineering prowess for the high end in addition to significant IP in file systems and clustering, but neither really had much storage technology. Today SGI's InfiniteStorage and other storage systems consist mostly of OEM platforms from NetApp and DDN along with ZFS based file storage.
SGI is positioned in high performance computing, archiving and big data, and while these OEM storage platforms fill a gap they do not command rich margins. Also, SGI is the engineering prowess behind the XFS file system (which is open source), so it is strange they would be using ZFS. The reason they do this, though, is that ZFS provides elements like disk pooling and snapshots which they would have to reinvent if they used XFS.
So how does acquiring Starboard Storage help SGI? The following are the key factors.
Starboard had developed a full hybrid storage platform that exploits SSD and uses sophisticated caching algorithms to enable the development of lower cost but higher performance storage platforms. Indeed Starboard won Best of Show at the Flash Memory Summit in California in 2013. Hybrid storage is a model that has found success in the market with Nimble (who recently had an IPO), Tintri and Tegile. Starboard technology has some potential advantages because, unlike Nimble and Tintri, it spans both file and block storage. The Starboard caching algorithms could also be leveraged in the future with the recently acquired assets from FileTek.
Starboard brings snapshots and dynamic disk pooling technology to SGI. The Starboard capacity-free snapshots are nice, but the disk pooling technology is potentially key to SGI. With the fixed archiving initiatives they are looking for large scale data pools. Starboard developed from scratch a disk pooling methodology that can scale to petabytes with minimal overhead and fast rebuild time. All data protection is also in software, so no RAID controller is needed. This, in combination with technology SGI got from Copan, could enable the development of some very powerful archiving platforms. The technology can also be extended to solid state only systems.
The Starboard Storage stack is software only. This enables SGI to leverage their own hardware for future platforms and also enables future Software Defined Storage models.
Starboard's file system of choice was XFS. With SGI's understanding of XFS and their development of Clustered XFS, this provides a unique opportunity to exploit IP that was not reaping significant benefits in the market for SGI. The fact that both Starboard and SGI have deep XFS knowledge, and that Starboard has implemented an entire storage stack around XFS, may be one of the key reasons why SGI acquired the technology. SGI may now be able to monetise their XFS investments.
Only SGI knows what the next steps will be but the Starboard technology offers them a number of interesting options. With the acquisition SGI now has the IP to build a competitive position in both servers and storage for its key markets, and potentially improve both revenue and margin.