Archive for the ‘Web Content Management’ Category
In terms of trends, few are as clear and popular as flash storage. While other technology trends might be more visible among the general public (think the explosion of mobile devices), the rise of flash storage among enterprises of all sizes has the potential to make just as big an impact in the world, even if it happens beneath the surface. There’s little question that the trend is growing and looks set to continue over the next few years, but the real question revolves around flash storage and the other mainstream storage option: hard disk drives (HDD). While HDD remains more widely used, flash storage is quickly gaining ground. The question then becomes, how long do we have to wait before flash storage not only overtakes hard drives but becomes the only game in town? A careful analysis reveals some intriguing answers and possibilities for the future, but it also highlights a number of obstacles that still need to be overcome.
First, it’s important to look at why flash storage has become so popular in the first place. One of the main selling points of flash storage, or solid-state drives (SSD), is its speed. Compared to hard drives, flash storage reads and writes data much faster. This is achieved by storing data in rewritable memory cells, which require no moving parts, unlike hard disk drives and their rotating disks (this also means flash storage is more durable). Increased speed and better performance mean apps and programs can launch more quickly. The capabilities of flash storage have become sorely needed in the business world since companies are now dealing with large amounts of information in the form of big data. To properly process and analyze big data, more businesses are turning to flash, which has sped up its adoption.
While it’s clear that flash array storage features a number of advantages in comparison to HDD, these advantages don’t automatically mean it is destined to be the sole storage option in the future. For such a reality to come about, solutions to a number of flash storage problems need to be found. The biggest concern and largest drawback to flash storage is the price tag. Hard drives have been around a long time, which is part of the reason the cost to manufacture them is so low. Flash storage is a more recent technology, and the price to use it can be a major barrier limiting the number of companies that would otherwise gladly adopt it. A cheap hard drive can be purchased for around $0.03 per GB. Flash storage is much more expensive at roughly $0.80 per GB. While that may not seem like much, keep in mind that’s about 27 times the price. For businesses being run on a tight budget, hard drives seem to be the more practical solution.
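To make the price gap concrete, here is a quick back-of-the-envelope sketch using the two per-GB figures quoted above. The 10 TB capacity is a hypothetical number picked purely for illustration, and the calculation covers raw media cost only, not power, cooling, or replacement.

```python
# Per-gigabyte prices quoted in the post (US dollars).
HDD_PRICE_PER_GB = 0.03
FLASH_PRICE_PER_GB = 0.80


def storage_cost(capacity_gb: float, price_per_gb: float) -> float:
    """Return the raw media cost for a given capacity."""
    return capacity_gb * price_per_gb


# How many times more flash costs per GB than HDD.
ratio = FLASH_PRICE_PER_GB / HDD_PRICE_PER_GB

# Hypothetical 10 TB deployment, chosen only for illustration.
capacity_gb = 10_000
hdd_cost = storage_cost(capacity_gb, HDD_PRICE_PER_GB)
flash_cost = storage_cost(capacity_gb, FLASH_PRICE_PER_GB)

print(f"Flash costs about {ratio:.0f}x as much per GB")
print(f"10 TB on HDD:   ${hdd_cost:,.0f}")
print(f"10 TB on flash: ${flash_cost:,.0f}")
```

At these prices the same capacity runs to a few hundred dollars on hard drives and several thousand on flash, which is the gap a tight budget has to absorb.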
Beyond the price, flash storage may also suffer from performance problems down the line. While it’s true that flash storage is faster than HDD, it also has a more limited lifespan. Flash cells can only be rewritten a limited number of times, so the more heavily a business writes to a drive, the sooner its performance will suffer. New technology has the potential to increase that lifespan, but it’s still a concern that enterprises will have to deal with in some fashion. Another problem is that many applications and systems that have been in use for years were designed with hard drives in mind. Apps and operating systems are starting to be created with SSD as the primary storage option, but more changes to existing programs need to happen before flash storage becomes the dominant storage solution.
So getting back to the original question, when will flash storage be the new king of storage options? Or is such a future even likely? Experts differ on what will happen within the next few years. Some believe that it will be a full decade before flash storage is more widely used than hard drives. Others have said that looking at hard drives and flash storage as competitors is the wrong perspective to have. They say the future lies with not one or the other but rather both used in tandem through hybrid systems. The idea would be to use flash storage for active data that is used frequently, while hard drives would be used for bulk storage and archive purposes. There are also experts who say discussion over which storage option will win out is pointless because within the next decade, better storage technologies like memristors, phase-change memory, and even atomic memory will become more mainstream. However the topic is approached, current advantages featured in flash storage make it an easy choice for enterprises with the resources to use it. For now, the trend of more flash looks like it will continue its impressive growth.
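The hybrid approach described above (flash for active data, hard drives for bulk and archive) boils down to a placement rule. Here is a minimal sketch of one such rule. The `DataObject` type, the `choose_tier` function, and the threshold of 10 reads per month are all hypothetical names and values made up for illustration; real tiering products use far richer signals.

```python
from dataclasses import dataclass

# Hypothetical threshold: objects read at least this often in the last
# 30 days count as "active". The number is illustrative, not from the post.
ACTIVE_READS_PER_MONTH = 10


@dataclass
class DataObject:
    name: str
    reads_last_30_days: int


def choose_tier(obj: DataObject) -> str:
    """Place frequently used data on flash, everything else on HDD."""
    if obj.reads_last_30_days >= ACTIVE_READS_PER_MONTH:
        return "flash"
    return "hdd"


objects = [
    DataObject("orders_db", 5000),
    DataObject("2009_audit_archive", 0),
    DataObject("weekly_report", 12),
]
placement = {obj.name: choose_tier(obj) for obj in objects}
print(placement)
```

The point of the sketch is only that the decision is driven by how often data is used, not by what kind of data it is.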
"At the toolbar (menu, whatever) associated with a document there is a button marked "Oh, yeah?". You press it when you lose that feeling of trust. It says to the Web, 'so how do I know I can trust this information?'. The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons.” [[Tim Berners Lee, 1997]]
“The problem is – and this is true of books and every other medium – we don’t know whether the information we find [on the Web] is accurate or not. We don’t necessarily know what its provenance is.” – Vint Cerf
The World Wide Web Consortium (W3C) hit another home run when the RDF PROVenance Ontology officially became a member of the Resource Description Framework last May. This timely publication proposes a data model well-suited to its task: representing provenance metadata about any resource. Provenance data for a thing relates directly to its chain of ownership, its development or treatment as a managed resource, and its intended uses and audiences. Provenance data is a central requirement for the trust-ranking processes often applied to digital resources sourced from outside an organization.
The PROV Ontology is bound to have important impacts on existing provenance models in the field, including Google’s Open Provenance Model Vocabulary; DERI’s X-Prov and W3P vocabularies; the open-source SWAN Provenance, Authoring and Versioning Ontology and Provenance Vocabulary; Inference Web’s Proof Markup Language-2 Ontology; the W3C’s now outdated RDF Datastore Schema; among others. As a practical matter, the PROV Ontology is already the underlying model for the bio-informatics industry as implemented at Oxford University, a prominent thought-leader in the RDF community.
At the core of the PROV Ontology is a conceptual data model with semantics instantiated by serializations including RDF and XML, plus a notation aimed at human consumption. These serializations are used by implementations to interchange provenance data. To help developers and users create valid provenance, a set of constraints is defined, which is useful for building provenance validators. Finally, to further support the interchange of provenance, additional definitions are provided for protocols to locate, access, and connect multiple provenance descriptions and, most importantly, for how to interoperate with the two widely used Dublin Core metadata vocabularies.
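To give a feel for what such provenance data looks like once serialized as RDF, here is a small sketch that writes a handful of statements as N-Triples using nothing but string formatting. The class and property names (`Entity`, `Activity`, `Agent`, `wasGeneratedBy`, `wasDerivedFrom`, `wasAttributedTo`, `wasAssociatedWith`) come from the PROV namespace. The `http://example.org/` resources (`report-v1`, `report-v2`, `editing`, `alice`) and the `triple` helper are made up for illustration, and a real application would more likely use an RDF library than hand-built strings.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the example resources


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one statement as an N-Triples line (IRIs only, no literals)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


statements = [
    # What kinds of things are involved.
    (EX + "report-v2", RDF_TYPE, PROV + "Entity"),
    (EX + "editing", RDF_TYPE, PROV + "Activity"),
    (EX + "alice", RDF_TYPE, PROV + "Agent"),
    # How the report came to be, and who is responsible for it.
    (EX + "report-v2", PROV + "wasGeneratedBy", EX + "editing"),
    (EX + "report-v2", PROV + "wasDerivedFrom", EX + "report-v1"),
    (EX + "report-v2", PROV + "wasAttributedTo", EX + "alice"),
    (EX + "editing", PROV + "wasAssociatedWith", EX + "alice"),
]

ntriples = "\n".join(triple(*s) for s in statements)
print(ntriples)
```

Read top to bottom, the statements say that a second version of a report was produced by an editing activity, derives from the first version, and is attributed to a particular agent.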
The PROV Ontology is also somewhat ambitious, despite the perils of over-specification. It aims to provide a model not just for discrete data-points and relations applicable to any managed resource, but also for describing in depth the processes relevant to its development as a concept. This is reasonable in many contexts, such as capturing the bibliography of a scholarly article, but it seems odd in the context of non-media resources such as Persons. For instance, it might be odd to think of a notation of one’s parents as within the scope of “provenance data”. The danger of over-specification is palpable in the face of grand claims that, for instance, scientific publications will be describable by the PROV Ontology to an extent that reveals “How new results were obtained: from assumptions to conclusions and everything in between” [W3 Working Group Presentation].
Recommendations. Enterprises and organizations should immediately adopt the RDF PROVenance Ontology in their semantic applications. At a minimum this ontology should be deeply incorporated within the fundamentals of any enterprise-wide models now driving semantic applications, and it should be a point of priority among key decision-makers. Based upon my review and its use in my clients’ applications, this ontology is surely of a quality and scope to drive a massive amount of web traffic in the not-too-distant future. A market for user-facing ‘trust’ tools based on this ontology should soon begin to appear, which can stimulate the evolution of one’s semantic applications.
As for timing, the best strategy is to internally incubate the present ontology, with plans to then fully adopt the second Candidate Recommendation. This gives the standardization process for this Recommendation a chance to achieve a better level of maturity and completeness.
Over the last month I’ve been talking a lot about personally controlled records and the ownership of your own information. For more background, see last month’s post and a discussion I took part in on ABC radio.
The strength of the response reinforces to me that this is an area that deserves greater focus. On the one hand, we want business and government to provide us with better services and to effectively protect us from danger. On the other, we don’t want our personal freedoms to be threatened. The question for us to ask is whether we are at risk of giving up our personal freedom and privacy by giving away our personal information.
I couldn’t help but think about that most obvious of literary works: George Orwell’s 1984. Like many teenagers of my generation, I read the book for the first time in 1984, right at the peak of the Cold War. My overwhelming feeling was one of cultural arrogance: Orwell had gotten it wrong, and the story did not apply to my society even though it probably was relevant for others.
In 2013 we are nearly as far from the year 1984 as George Orwell was when he wrote the book in 1948. Arguably, as much has happened since 1984 as had occurred between 1948 and 1984. The book introduces many interesting ideas including “telescreens”, “thoughtcrime” and “newspeak”. While the forces that Orwell wrote about have not been the driver for these concepts to come to reality, much of their essence may well have slipped into our society without us noticing.
The ubiquitous telescreen of the book was a frightening device that combined a television with a camera, which allowed authorities to watch what you were doing at all times. While the technology has been around since Orwell’s own time, it really hasn’t been until the rise of the smartphone that constant monitoring has become possible.
While we aren’t being monitored visually, we are increasingly giving away large amounts of personal information in terms of our location. Worse, it is starting to become a suspicious act when we choose to take ourselves off this form of tracking for a period of time. To see how this is playing out in the courts, just look at criminal trials where the defendant is asked to justify why they’ve turned their phone off at the time of a crime taking place.
In the 1984 that we all experienced, freedom of thought was entrenched through institutions such as a free press and free libraries supporting research without fear of surveillance. By 2013, many of these institutions have either moved online entirely or are well on their way to doing so. Far from providing the protection of a library system that ensured complete confidentiality of research topics, any government can see what interests most of its citizens choose to pursue through Wikipedia or any other research tool.
The Orwellian concept of thoughtcrime assumed that there was some sort of hint at unconventional thoughts that could be a risk to their society. It is easy to see that today’s governments could come to the same conclusion using the tools of the internet to identify what they deem to be antisocial interests.
Finally, newspeak was the language that the shadowy rulers of 1984 were creating to dumb-down the population and discourage thoughtcrimes. While it might be a stretch, it is staggering to see how the short-form of modern messaging such as Twitter is encouraging a simplification of our language which is finding its way into the mainstream.
It is easy to write a post that claims conspiracies at every turn. Far from arguing a major government plan to undermine our freedom, I thought that it was interesting to see that many of George Orwell’s fears are coming true. The cause is not an oppressive government but rather an eagerness by the population as a whole to move services onto new platforms without demanding the same level of protection that their previous custodians have provided for a couple of centuries or more.
If you’re like most folks in the corporate world, you continue to be flooded with e-mails. Do any of these questions sound familiar?
- Can you e-mail that to me?
- Did you send that e-mail?
- Why didn’t I receive that e-mail yet?
- Should I e-mail you that again?
- That seems to have bounced. Is the file too big for your e-mail?
It’s no overstatement to say that most large organizations still rely upon e-mail as their killer app. This holds true despite the fact that wikis, file sharing sites, microblogging tools, and their ilk are often far better communication mechanisms and repositories of information. In some companies, they have already replaced intranets and knowledge bases, which now seem so, er, 1990s.
So, we know that e-mail is an important tool, even though it should not be the exclusive one. Weaning many people and companies off of e-mail is no small endeavor. As I have written about before, old habits die hard.
So, what can we do to improve things? How can knowledge be gleaned from millions of e-mails, rife with important–yet unstructured–data?
Technology seems to have created a monster but, as is often the case, technology can also solve the very problems it has created. In this case, semantic technologies are particularly well-suited for just this type of thing. We’re not just talking about simple, text-based searches for keywords–e.g., choice profanities or individuals’ names. We’re talking about much, much more.
Consider a recent New York Times article that sheds light on the increasing ability of new programs and technologies to interpret meaning from e-mails. From the piece:
…Cataphora software can also recognize the sentiment in an e-mail message — whether a person is positive or negative, or what the company calls “loud talking” — unusual emphasis that might give hints that a document is about a stressful situation. The software can also detect subtle changes in the style of an e-mail communication.
A shift in an author’s e-mail style, from breezy to unusually formal, can raise a red flag about illegal activity.
“You tend to split a lot fewer infinitives when you think the F.B.I. might be reading your mail,” said Steve Roberts, Cataphora’s chief technology officer.
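To get a rough sense of what detecting “loud talking” might involve, here is a deliberately crude sketch. It is not Cataphora’s method, which the article does not describe. The `loudness_score` function, its weighting of exclamation marks, and the `LOUD_THRESHOLD` value are all assumptions invented for illustration.

```python
import re

# Hypothetical cut-off above which a message is flagged as "loud".
LOUD_THRESHOLD = 0.2


def loudness_score(text: str) -> float:
    """Share of words written in all caps, plus a small bump per '!'."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    # Ignore one-letter words so that "I" and "A" don't count as shouting.
    shouted = [w for w in words if len(w) > 1 and w.isupper()]
    return len(shouted) / len(words) + 0.05 * text.count("!")


calm = "Please send the updated forecast when you get a chance."
loud = "I need the forecast NOW!!! This is NOT acceptable."

for message in (calm, loud):
    flagged = loudness_score(message) > LOUD_THRESHOLD
    print(f"{'LOUD' if flagged else 'calm'}: {message}")
```

A real system would presumably compare each author against his or her own baseline rather than use a fixed threshold, since the quoted passage stresses changes in an individual’s style.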
Examples and Ramifications
The implications of the effective use of this technology are vast. Let’s say that you work in HR and have reservations about key employees defecting to one of your competitors. What if you could proactively discover and address their concerns, ultimately keeping them with your company?
Or consider a company attempting to launch a new product key to its future. What if you could determine the state of the product beyond formal status reports and project plans? What if employee verb choices could help you spot red flags?
Legally, many US courts, at least, have ruled that employees have an expectation of privacy while on the phone. However, they don’t have that same expectation while using company e-mail.
Stay tuned here. It’s going to be a wild ride. If Watson can beat supersmart humans on Jeopardy!, imagine what computers will be able to do with e-mails. I can think of good, bad, and ugly uses of these new technologies.
What say you?
ECM3 (ecm3.org) has been without a doubt the most successful maturity model for ECM (Enterprise Content Management, aka Document Management), with downloads of the model passing the 5,000 mark recently. So how to top success with more success? Well, we have decided to merge efforts with MIKE2.0, the de facto maturity model for structured data. Our hope is that by adding our work in unstructured data to MIKE2.0, we can spread the love even further and help raise the profile and importance of ECM.
Enterprises face ever-increasing volumes of content. The practice of Enterprise Content Management (ECM) attempts to address key concerns such as content storage; effective classification and retrieval; archiving and disposition policies; mitigating legal and compliance risk; reducing paper usage; and more.
However, enterprises looking to execute on ECM strategies face myriad human, organizational, and technology challenges. As a practical matter, enterprises cannot deal with all of these challenges concurrently. Therefore, to achieve business benefits from ECM, enterprises need to work step-by-step, following a roadmap to organize their efforts and hold the attention of program stakeholders.
The ECM Maturity Model (ECM3) attempts to provide a structured framework for building such a roadmap, in the context of an overall strategy. The framework suggests graded levels of capabilities — ranging from rudimentary information collection and basic control through increasingly sophisticated levels of management and integration — finally resulting in a mature state of continuous experimentation and improvement.
Level 1: Unmanaged
Level 2: Incipient
Level 3: Formative
Level 4: Operational
Level 5: Pro-Active
Like all maturity models, it is partly descriptive and partly prescriptive. You can apply the model to audit, assess, and explain your current state, as well as inform a roadmap for maturing your enterprise capabilities. It can help you understand where you are over- and under-investing in one dimension or another (e.g., overspending on technology and under-investing in content analysis), so you can re-balance your portfolio of capabilities. The model can also facilitate developing a common vocabulary and shared vision among ECM project stakeholders. And it is our fervent hope that the ECM model work we started will be continued, expanded upon, and itself mature with the MIKE2.0 community.
Virtually every company is thinking about how to drive digital growth, getting more and more visitors and maybe even establishing something like an online community driven by forums and social networking type functionality. VCs still believe that the number one driver for valuation of a digital business is number of visitors. So how do you really drive digital growth? Here are my top lessons learned from projects in the media and comms industry:
1. Make sure you have SEO compliant coding of the website
2. Employ SEO and online marketing specialists to drive traffic
3. Use site visitor analytics to understand their behaviour on the site and to adjust your content and navigation accordingly
4. Perform a connected customer analysis to understand what is hot in the market, what are people talking about etc. to adjust your content accordingly
5. Execute on focused 3rd party deals to drive traffic and brand awareness with specific relevance to your site
6. Create sticky applications like tools and games to increase hits per user, visit length but also get additional users by allowing them to share the tool with others
7. Drive cross sales between your offline business and digital, eg with deals or links related to your offline products that drive traffic to your website
8. Configure a site search engine with solid categorisation and predictive functionality to drive traffic through search visits
9. Write good and dynamic content to drive repeat traffic
10. Use personalisation features to engage users with ‘their’ site
11. Drive organic SEO by cross posting on other sites that have high page rank value
12. Pay bloggers or review sites to write about your site or, even better, specific content on your site
13. Use rich media and multiple channels (e.g. mobile access) to drive enhanced customer experience and traffic
Just my two cents…