Sunday 19 August 2012

Big Data - The three Vs

A bit of taxonomy for this blog. I'm currently taking part in a Big Data initiative within my company and I was thinking of sharing some important findings here.
Before doing that, let's define Big Data.
I will ask Wikipedia for some help:
"Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data in a single data set."

Basically, we are talking about massive volumes of data that commonly used software tools cannot process within a tolerable time.
Is that all? Not quite.
Big Data is defined by three characteristics, which form what is better known as the three Vs: Volume, Variety and Velocity.


Volume 
We already covered this in the definition: single data sets ranging from a few dozen terabytes up to many petabytes.

Variety
This is a key characteristic. Variety covers all types of data. We shift from traditional structured data (e.g. normal database tables), to semi-structured data (e.g. XML files with free-text descriptions), to completely unstructured data such as web logs, tweets and social networking feeds, sensor logs, surveys, medical checks, genome mappings, etc.
Traditional analytics platforms are not designed to handle this variety.

Velocity

In the traditional data world, velocity commonly means how fast data comes in and gets stored. Traditional analytics platforms deal with largely static data: RDBMS tables whose rate of change is pretty much under control.
In the Big Data world, data is highly mutable, continuously changing and increasing in volume. We are talking about real-time, streaming data. A web log or a tweet hashtag keeps growing and flowing. In the Big Data world, velocity is the speed at which the data flows.
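
To make the idea of velocity a bit more concrete, here is a minimal Python sketch of processing data as it flows in, rather than loading a static table first. Everything in it is an assumption made for illustration: the function name running_hashtag_counts, the sample tweets, and the choice of an in-memory counter. A real Big Data platform would read from a live feed and distribute the work across many machines.

```python
from collections import Counter
from typing import Iterable, Iterator


def running_hashtag_counts(tweets: Iterable[str]) -> Iterator[Counter]:
    """Yield an updated hashtag count after every incoming tweet.

    The input is consumed one item at a time, so it could just as well be
    an endless feed as a finite list (hypothetical helper, not a real API).
    """
    counts: Counter = Counter()
    for tweet in tweets:
        for word in tweet.split():
            # Treat any whitespace-separated token starting with '#' as a hashtag.
            if word.startswith("#") and len(word) > 1:
                counts[word.lower()] += 1
        # Hand back a snapshot so earlier results are not mutated later.
        yield counts.copy()


# A tiny, made-up stream standing in for a real-time feed.
sample_stream = [
    "Loving the #BigData summit",
    "#bigdata is about volume, variety and #velocity",
    "Nothing to tag here",
]
snapshots = list(running_hashtag_counts(sample_stream))
```

The point of the generator is that a result is available after every single tweet, which is the difference between querying data at rest and keeping up with data in motion.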


Putting the three characteristics together, a Big Data initiative should be able to cope with massive volumes of fast-flowing, mostly unstructured data.
To achieve what? This will be the subject of my next few posts.
Antonio




Saturday 12 May 2012

Rapid Prototyping: key for innovation


I recently attended a technology summit in London. Quite an interesting one, which I will write about in the next few days.
I was part of the CAB (Customer Advisory Board) of the technology vendor that organized the summit.

One of the key questions that arose at the CAB meeting was around innovation. What do you need to innovate? What are the key factors that you find important for innovation?

I didn’t have to think much about the answer, given that I’m actually facing it every day.
Rapid prototyping is becoming a key factor for innovation.

In the era of the SaaS explosion, all the software features are already out there. We are no longer developing features; we are composing features offered by different vendors.






Composing working applications through service orchestration is now faster than ever. Tools for rapid prototyping of service compositions will play a key role in the future of innovation.

Customers ask every day for working prototypes. “Give me something that I can see and use rather than showing me another PowerPoint” was a recent comment I heard.

Showing a working prototype on a mobile device, an iPad or an Android tablet, rather than in a PowerPoint deck, can be the difference that makes your innovative idea stick.
It says “this is what it does” rather than “this is what I promise it will do”.

Antonio

Thursday 23 February 2012

How do you move to SOA? A pragmatic approach.

This is a common question and there isn't a simple answer.
How do you actually make your entire software architecture 100% SOA compliant, meaning that every component of the architecture is designed as a service?
Well, in my opinion, you need to define a clear roadmap for the governance of legacy applications, plus design standards for new components. It cannot happen overnight; an incremental approach is generally necessary. It is a journey for the whole organization, not just a new process to implement.

Someone else, though, decided that a quicker and more pragmatic approach would work better.

I stumbled upon an old email sent by the Amazon CEO to his software developers, designers and architects in order to introduce SOA. One of the employees mistakenly shared the email on his public technology blog.

Before the email was sent, Amazon's software architecture was made of silos, without any internal APIs.
To tackle this problem, the CEO "suggested" this:

subject: Amazon shall use SOA!

body:

  1. “All teams will henceforth expose their data and functionality through service interfaces.
  2. Teams must communicate with each other through these interfaces.
  3. There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team's data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
  4. It doesn't matter what [API protocol] technology you use.
  5. Service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
  6. Anyone who doesn't do this will be fired.
  7. Thank you; have a nice day!”
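
As a rough illustration of rules 1 to 3, the Python sketch below shows one team consuming another team's functionality only through a declared interface, never by reading its data store. All the names (InventoryService, InMemoryInventory, OrderService, the "book-123" SKU) are hypothetical and not taken from Amazon. The calls also happen in-process for brevity, whereas the email requires them to go over the network, so read it as a sketch of the boundary rather than of the transport.

```python
from typing import Protocol


class InventoryService(Protocol):
    """Hypothetical public interface owned by an inventory team."""

    def stock_level(self, sku: str) -> int: ...


class InMemoryInventory:
    """Stand-in implementation; the data store stays private to its team."""

    def __init__(self, stock: dict[str, int]) -> None:
        # Copy the input so no other team holds a reference to our store.
        self._stock = dict(stock)

    def stock_level(self, sku: str) -> int:
        # Unknown SKUs are reported as out of stock.
        return self._stock.get(sku, 0)


class OrderService:
    """Hypothetical consumer that only knows the interface, not the store."""

    def __init__(self, inventory: InventoryService) -> None:
        self._inventory = inventory

    def can_fulfil(self, sku: str, quantity: int) -> bool:
        # The only way to learn about stock is the service call.
        return self._inventory.stock_level(sku) >= quantity


orders = OrderService(InMemoryInventory({"book-123": 2}))
```

Because OrderService depends only on the interface, the in-memory stand-in could later be swapped for a client that makes the same call over HTTP without touching the ordering code.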



Quite a pragmatic approach to SOA governance and, looking at what the Amazon cloud has become, it appears to me that it worked quite well.


Antonio

Friday 3 February 2012

Volunia Search Engine to be launched soon

Volunia, the Italian web search engine, is due to be launched on Monday the 6th of February.


Developed by Massimo Marchiori (one of the top 100 researchers in the world, in the picture with Tim Berners-Lee), it will be launched via web conference from the University of Padua, where Massimo currently teaches Databases and Information Systems.

Massimo created the HyperSearch technique, which is currently adopted by most search engines (Google above all).

He claims the new engine will have some new features that all the other search engines will adopt within the next four or five years. He stated that it is more than just another search engine: "It will be different, it will do things that Google is currently not able to do".



I'm looking forward to the launch event. I will post an update here to discuss the new features and the core technology concepts behind the Volunia search engine.

To attend the event, check this website: http://launch.volunia.com/

Antonio