Preparing for NoSQL
From MIKE2.0 Methodology
New products in the NoSQL category are being released weekly. Most of them are open source and can be provisioned quickly in the cloud. Despite enterprise information governance standards, developers are increasingly comfortable with this model. Consider just Dynamo, Voldemort, Tokyo Cabinet, KAI, Riak, SimpleDB, CouchDB, Cassandra, Berkeley DB, MemCacheDB, Redis, MongoDB, TerraStore, Scalaris, BigTable, HyperTable, HBase, Vertica, Greenplum, InfiniDB, Aster Data, and InfoBright.
But is business management becoming more comfortable with open source NoSQL? How about information technology management? Although many of the developers bringing in these tools work in IT, IT management mostly has only a marginal understanding of the technologies, let alone awareness that they are coming into the environment. Furthermore, as business groups become increasingly impatient with IT, their own developers are moving forward with these initiatives. I’ve seen several “this is not in production” production NoSQL systems running in Fortune 100 shops over the past year. That label keeps them not only off IT executive management radars, but also off a lot of big analyst radars, so we all know less than we need to know to be successful with NoSQL. However, NoSQL cannot stay off IT management’s radar forever.
The salient issues IT management will need to confront in NoSQL implementations include:
1. Developers learn the skills on the fly, just in time
2. Lack of ACID compliance, in situations where ACID guarantees would usually apply, means delayed transaction consistency at best
3. A dominant view of these tools could be that, despite the release number, they exhibit characteristics of sub-version-1.0 tools.
4. The fast pace of unburdened projects could lend itself to poor methodology that underestimates risk, diligence, robustness and the like
5. Lack of consideration for mandatory regulations the shop may be operating under contractually or from government bodies
6. The developers are usually not the same people who have built robust, enterprise-ready applications in the past
7. The projects are often focused on data and not on business results
8. Schema-less (or schema-lite) data models, among other NoSQL constructs, give immense flexibility to the systems and could require a more dynamic view of “production”
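To make issue 2 concrete, here is a minimal sketch of delayed consistency. It is a toy model, not the behavior of any specific product listed above: the class `EventuallyConsistentStore`, its methods, and the key name are all hypothetical, and replication lag is simulated by an explicit `replicate()` call.

```python
class EventuallyConsistentStore:
    """Toy two-copy store (hypothetical): writes land on the primary and
    reach the replica only when replicate() runs, standing in for lag."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes accepted but not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self._pending.append((key, value))

    def read_replica(self, key):
        # Reads served by the replica may be stale.
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes in order, then clear the queue.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.write("balance:42", 100)
stale = store.read_replica("balance:42")  # replica has not caught up yet
store.replicate()
fresh = store.read_replica("balance:42")  # replica now reflects the write
```

Between the write and the replication step, a reader of the replica sees nothing for the key. An application that assumes ACID-style reads would have to tolerate or design around that window.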
I’m not taking sides here as there is value in NoSQL and in well-done agile projects for that matter. My advice is to prepare for the day of reckoning when the inevitable happens. In my experience, these projects mostly go legitimate the hard way when they predictably reach out to other systems for integration and discover their own architectural flaws. For example, a Hadoop project could find it needs data beyond the single source it started with.
Some advice on preparing NoSQL for integration:
1. Develop a better understanding of full costs. Even though there are substantial data storage savings compared to SQL counterpart projects, costs still need to be managed in NoSQL environments. I know of projects now that, if they are not careful, could end up replacing those storage costs with other costs, like a balloon that is squeezed on one side and expands on the other.
2. Develop an ROI model for the project. Don’t put the cart before the horse. There is a reason the organization hasn’t managed this data to date. If that reason is technology cost and NoSQL solves it, then green light. However, if the data isn’t truly needed for anything, or the organization isn’t getting ready to create business benefit from managing it, that should drive caution. I’m not saying don’t take risks or lead the way, but having legitimate answers to the ROI question is only prudent.
3. Ease into it. With the barriers to data capture removed, it is tempting to start with the most egregious problems and petabytes of data. Petabytes are difficult in any environment. As you are learning the technology and its bounds, as well as the organization’s interests, deliver time-blocked projects with business impact.
4. Governance. This is an ugly word to most NoSQL projects. As a matter of fact, despite what may seem like an incorrect apportionment of responsibilities, I encourage existing governance groups to redesign themselves to be relevant in this new world. A dogmatic, bureaucratic approach is not only not going to work for NoSQL, it is not going to work for much longer anywhere, as companies are increasingly becoming deliverable-based. That being said, the NoSQL team also bears responsibility for understanding governance’s role and the standards it sets for all projects. There is no excuse for developing in violation of security requirements, for example.
5. Consider technical integration points with the “legacy” environment. This doesn’t mean you have to actually do any integration on day one, but things like the data grain and naming standards in the models, and the tool sets used in utility operations, could be selected with eventual integration in mind.
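Items 1 and 2 above can be sketched as a very small cost and ROI calculation. This is only an illustration under stated assumptions: the cost categories, the one-year simple ROI formula, and every number below are placeholders, not figures from any real project.

```python
from dataclasses import dataclass


@dataclass
class ProjectCosts:
    """Hypothetical annual cost categories; storage is only one of them."""
    storage: float
    compute: float
    staffing: float
    integration: float

    def total(self) -> float:
        return self.storage + self.compute + self.staffing + self.integration


def simple_roi(annual_benefit: float, costs: ProjectCosts) -> float:
    """Simple one-year ROI: (benefit - total cost) / total cost."""
    total = costs.total()
    if total <= 0:
        raise ValueError("total cost must be positive")
    return (annual_benefit - total) / total


# Placeholder numbers for illustration only.
costs = ProjectCosts(storage=20_000, compute=50_000,
                     staffing=120_000, integration=10_000)
roi = simple_roi(annual_benefit=300_000, costs=costs)
print(f"total cost: {costs.total():,.0f}, ROI: {roi:.0%}")
```

Listing the non-storage categories explicitly is the point of the exercise: it makes the balloon effect visible, because a saving in `storage` that is offset by growth in `staffing` or `compute` shows up in the total.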
Organizations and NoSQL need to come together and the operative word is governance – flexible and accommodating, yet business-first and principled governance.