The ultimate goal of the set of technologies collectively referred to as big data is to improve the accuracy of decisions and to decrease the latency of making them. In competitive business environments, applying big data is not a choice; it is a necessity for sustaining the organisation and achieving growth. In healthcare, it is a moral obligation, reducing uncertainty and delivering optimised treatments. In short, big data is a means of improving the odds that a decision is the right one.
Big data has been enabled primarily by the ability to collect and consume census-scale data sets for a given problem space. Before the advent of big data, analysts relied on sampling from a population and could draw conclusions only within a margin of uncertainty determined by the sample size.
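The point about sampling can be made concrete. In classical statistics, the 95% margin of error for an estimated proportion shrinks only as the square root of the sample size, which is why pre-big-data analysts had to accept residual uncertainty that census-scale data largely removes. The sketch below (a hypothetical survey scenario, not drawn from the text) illustrates that scaling:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sampled proportion p
    estimated from n independent observations (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst-case proportion p = 0.5 maximises the variance p(1 - p).
p = 0.5
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9}: margin of error = ±{margin_of_error(p, n):.4f}")
# n = 100       → ±0.0980
# n = 10,000    → ±0.0098
# n = 1,000,000 → ±0.0010
```

A hundredfold increase in sample size buys only a tenfold reduction in uncertainty; consuming the full population, where feasible, removes sampling error altogether.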
At the base of the big data stack is the physical layer: the very large clusters of racks in data centres operated by numerous cloud service providers (CSPs) and reachable via broad network access. Aside from analytics, the 'big' attribute of big data also has a meaning in information technology: above a certain size threshold (terabytes, petabytes and beyond), data sets cannot be handled with traditional data storage technologies. The prevailing solution in that area is the open-source Hadoop Distributed File System (HDFS), whose specifications are maintained by the non-profit Apache Software Foundation.
From the perspective of a new or prospective big data user, the good news is that there is no need to become an expert in HDFS technology to take advantage of the power of big data. The other great advantage is that adopting big data does not require capital expenditure: the cost shifts to the operational side of the organisation, removing the 'big' headache of procuring hardware. It is no longer necessary to spend significant time planning IT infrastructure far in advance in the hope of sizing it correctly. However, some planning remains necessary.