As you plan your big data strategy for next year, keep these seven goals in mind.
In 2021, corporate big data leaders will be looking to improve data quality and turnaround of big data projects, as well as performance in meeting business objectives. While 2020 hasn’t been a normal year for anyone, you still have to plan for the future and get ready for what may come. Here are seven key big data areas of focus for 2021.
1. Manage data better
Big data continues to pour into corporate networks at torrential rates, and the poor-quality data that companies obtain or use costs the US economy an estimated $3.1 trillion annually. More effort is needed to screen data as it comes in, and to clean and prepare it properly before it is added to corporate data repositories.
At IBM Research Switzerland, artificial intelligence (AI) and machine learning helped researchers plow through reams of scientific papers and journals in search of information relevant to a molecular drug design. Researchers recognized that much of the worldwide information the AI would be reviewing would have no relevance to the problem they were trying to address. The company decided up front not to import data from irrelevant sources. This saved hours of AI time, gave the researchers a highly relevant set of data, and eliminated data storage waste.
Once the data passes incoming criteria, it should also be cleaned and properly prepared before it is uploaded into a data repository. This means checking for incomplete, duplicate, and inaccurate data, and also normalizing data so it can be blended with other source data for analytics.
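As a rough illustration of the cleaning step described above, here is a minimal Python sketch. The record layout (customer_id, email, country), the function name clean_records, and the email plausibility check are all hypothetical and chosen only for the example; a real pipeline would apply its own schema and validation rules.

```python
# Hypothetical schema: these field names are assumptions for illustration.
REQUIRED_FIELDS = ("customer_id", "email", "country")


def is_blank(value):
    """Return True when a field is missing or contains only whitespace."""
    return value is None or not str(value).strip()


def clean_records(records):
    """Drop incomplete, inaccurate, and duplicate rows; normalize the rest."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        # Incomplete: a required field is missing or blank.
        if any(is_blank(rec.get(field)) for field in REQUIRED_FIELDS):
            continue
        # Normalize so the row can be blended with other sources.
        row = {
            "customer_id": str(rec["customer_id"]).strip(),
            "email": str(rec["email"]).strip().lower(),
            "country": str(rec["country"]).strip().upper(),
        }
        # Inaccurate: a deliberately crude plausibility check (assumption).
        if "@" not in row["email"]:
            continue
        # Duplicate: keep only the first row seen for each customer_id.
        if row["customer_id"] in seen_ids:
            continue
        seen_ids.add(row["customer_id"])
        cleaned.append(row)
    return cleaned
```

Normalization runs before the duplicate check so that records differing only in spacing or letter case are recognized as the same record.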
2. Speed and monitor the process
By now, most organizations are well underway with an iterative, DevOps-style development approach for big data and analytics. Now it’s time to formalize the process so users and IT/data science know when a big data analytics model is mature enough to be placed into and maintained in production.
The benchmark for corporate readiness is that big data analytics results must reach a threshold of 95% accuracy and must consistently deliver this level of performance. Since business and outside conditions change over time, it’s possible that a big data application in production can start falling below 95% accuracy.
IT and data science should establish a maintenance policy that remeasures apps for accuracy each year to ensure that they are still delivering accurate results.
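The accuracy check behind such a policy might look like the following Python sketch. It assumes accuracy is measured as the share of model predictions that match known outcomes on a labeled sample, which is one possible definition and not one the article specifies; the function names are hypothetical.

```python
ACCURACY_THRESHOLD = 0.95  # the 95% readiness benchmark described above


def accuracy(predictions, actuals):
    """Fraction of predictions that match the known outcomes."""
    if not actuals or len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


def needs_review(predictions, actuals, threshold=ACCURACY_THRESHOLD):
    """Return True when measured accuracy has fallen below the threshold."""
    return accuracy(predictions, actuals) < threshold
```

For example, a model that gets 9 of 10 sampled cases right scores 0.9 and would be flagged for review, while one that gets all 10 right would not.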