Difference between Products, Solutions and Services

Products = A software product or utility, such as a database backup utility or encryption software.
Solutions = Answers to problems, such as a High Performance Computing (HPC) solution on the cloud – that is, how to implement HPC on the cloud.
Services = The supply of information and support, providing expertise built through experience.

 


PaaS, BPaaS, and IaaS Trends in 2012

PaaS

• More startups are entering the PaaS space, focusing on different language stacks, driven by cost reduction and business agility.

• Forrester predicts this category will grow from $0.82 billion in 2011 to $11.91 billion in 2020.

• Platform-as-a-service (PaaS) is the next big opportunity.

• PaaS will be used to reduce multi-platform, maintenance, and support costs and to achieve high resource utilization.

• PaaS helps get time-boxed projects done at a low total cost.

• IT organizations will have positive experiences using PaaS services and will increase their use.


BPaaS

• Forrester predicts this category will grow from $0.53 billion in 2011 to $10.02 billion in 2020.

• Mainstreaming continues – everything “as a service”. The movement toward mainstreaming the cloud has been validated by popular services from major consumer players and continues to grow the cloud’s share of technology spend.

IaaS

• Forrester predicts this category will grow from $2.94 billion in 2011 to $4.78 billion in 2020.

• Expected or realized cost savings of 30%–40% or higher.

• Infrastructure-as-a-service (IaaS) takes off but gets complicated.

• IaaS market leaders aren’t immune to these cloud computing trends. Increased competition and pressure to grow profits will motivate them to expand into the SaaS and PaaS markets.

Public Cloud Trends in 2013

Forrester forecasts the public cloud market will grow from $25.5 billion in 2011 to $159.3 billion in 2020. The major trends in the public cloud include:

  • Emphasis on cloud security
  • Adoption of services by informal (mid-market, SMB) buyers
  • Governance and compliance will gain more visibility
  • Inclination toward the IaaS and PaaS models
  • The “breaking the ice” phase is over; cloud has moved from hype to reality
  • High Performance Computing (HPC) in the cloud
  • A double-edged sword: better services, but new security issues
  • Public clouds converge and become more business-centric
  • Public cloud pricing structures are evolving, but not always as expected or for the better
  • Public clouds are changing the IT service management (ITSM) landscape
  • ITSM technology has a big role to play in managing public clouds

Big Data with Hadoop and Cloud

What is Big Data?

“Big Data” is a catchphrase that has been bubbling up from the high-performance computing niche of the IT market. Increasingly, suppliers of processing-virtualization and storage-virtualization software have begun to flog “Big Data” in their presentations. What, exactly, does the phrase mean?


“Big data” is data that becomes large enough that it cannot be processed using conventional methods.

Web search engines, social networks, mobile phones, sensors, and science contribute to petabytes of data created on a daily basis. Scientists, intelligence analysts, governments, meteorologists, air traffic controllers, architects, civil engineers – nearly every industry and profession is experiencing the era of big data. Add to that the fact that the democratization of IT has made everyone a (sort of) data expert, familiar with searches and queries, and we’re seeing a huge burst of awareness of big data.

An example often cited is how much weather data is collected on a daily basis by the U.S. National Oceanic and Atmospheric Administration (NOAA) to aid in climate, ecosystem, weather, and commercial research. Add that to the masses of data collected by the U.S. National Aeronautics and Space Administration (NASA) for its research and the numbers get pretty big.

Much of this data has multifaceted and undiscovered relationships; it doesn’t fit simply into relational models.

Practical examples of big data processing include [1]:

A. LinkedIn

• Discovering “People You May Know” and other fun facts
• Item-to-item recommendations
• Member and company derived data
• Users’ network statistics
• “Who Viewed My Profile?”
• Abuse detection
• Users’ history service
• Relevance data
• Crawler detection

B. MobileAnalytic.TV

• Natural language processing
• Mobile social network hacking
• Web crawlers / page scraping
• Text to speech
• Machine-generated audio and video with remixing
• Automatic PDF creation and IR

C. Datagraph

• Batch-processing large RDF datasets and indexing RDF data. RDF extends the linking structure of the Web, using URIs to name the relationship between things as well as the two ends of the link.
• Executing long-running offline SPARQL queries

D. GumGum – in-image ad network

• GumGum is an analytics and monetization platform for online content.
• Image and advertising analytics

E. Lineberger Comprehensive Cancer Center – Bioinformatics Group

• Accumulating and analyzing next-generation sequencing data produced for the Cancer Genome Atlas project and other groups

F. Pharm2Phork Project – agricultural traceability

• Processing observation messages generated by RFID/barcode readers as items move through the supply chain
• Analyzing BPEL-generated log files for monitoring and tuning of workflow processes


Why enterprises should look into big data

Human-generated data fits well into relational tables or arrays; examples are conventional transactions: purchases and sales, inventory and manufacturing, employment status changes, and so on.

Another type of data is machine-generated data. Machines produce unstoppable streams of big data [2]:

1. Computer logs
2. Satellite telemetry (espionage or science)
3. GPS outputs
4. Temperature and environmental sensors
5. Industrial sensors
6. Video from security cameras
7. Outputs from medical devices
8. Seismic and geophysical sensors

Big data that doesn’t conform to known models is discarded or archived unanalyzed. As a result, enterprises miss information, insights, and opportunities to extract new value.

Various Solutions

Big Data requires exceptional technologies to efficiently process large quantities of data within tolerable elapsed times. Technologies being applied to Big Data include massively parallel processing (MPP) databases, data-mining infrastructures such as the Apache Hadoop framework, distributed file systems, distributed databases, MapReduce algorithms, cloud computing platforms, the Internet, and archival storage systems.

MapReduce is a programming model and an associated implementation for processing and generating big data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Computational processing can take place on data stored either in a file system (unstructured) or within a database (structured). Programs written in this functional style are automatically parallelized and executed on a big cluster of commodity machines. This allows programmers without any experience with parallel and distributed systems to effortlessly utilize the resources of a large distributed system.

[Figure: MapReduce (source: Google)]
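
To make the programming model concrete, here is a minimal word-count sketch against Hadoop’s Java MapReduce API: the map function emits a (word, 1) pair for every token, and the reduce function sums the counts for each word. This is an illustrative example, not code from the article; the input and output paths come in as command-line arguments.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map: emit (word, 1) for every token in the input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce: sum all counts emitted for the same word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each mapper
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The framework handles the parallelization: input splits are mapped on many nodes at once, intermediate pairs are shuffled and sorted by key, and each reducer receives a key together with all of its values.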

There are two ways to process big data using MapReduce: 1) HPC and 2) cloud computing.

HPC includes advanced computing, communications, and information technologies: scientific workstations, supercomputer systems, high-speed networks, and special-purpose and experimental systems. For big data processing, a new generation of large-scale parallel systems is used, with application and system software components well integrated and linked over a high-speed network.

The second way is to process big data with cloud computing. This is likely to be a key breakthrough in data processing because of the benefits of the cloud:

• Easy and inexpensive setup, because hardware, application, and bandwidth costs are covered by the provider
• Scalability to meet needs
• No wasted resources, because you pay for what you use

There are different ways to implement big data processing in the cloud, for example: 1) Hive, 2) Pig, and 3) Hadoop.

Hive provides a rich set of tools, in multiple languages, to perform SQL-like analysis of data stored in HDFS. Pig is used for writing SQL-like operations that apply to datasets; the Pig project provides a compiler that produces MapReduce jobs from a Pig Latin script. Our major attention is on Hadoop, and we can add flavor by introducing Hadoop for big data processing in the cloud.
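
As a sketch of the Hive route, a query like the one below could be submitted from Java through Hive’s JDBC driver; Hive compiles it into MapReduce jobs over the data in HDFS. The host, port, and clickstream table are hypothetical, and the driver class shown is the HiveServer2 one, so treat this as an assumption-laden illustration rather than a recipe.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver; host, port, and database are illustrative.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "user", "");
        Statement stmt = conn.createStatement();

        // SQL-like aggregation over HDFS data; Hive turns this into MapReduce jobs.
        // The 'clickstream' table is a hypothetical example.
        ResultSet rs = stmt.executeQuery(
                "SELECT page, COUNT(*) AS hits FROM clickstream "
                + "GROUP BY page ORDER BY hits DESC LIMIT 10");
        while (rs.next()) {
            System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}
```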

Apache Hadoop is a software framework inspired by Google’s MapReduce and Google File System (GFS) papers.

[Figure: Hadoop and its use cases]

Hadoop MapReduce is a programming model for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes [3].

Hadoop processes and analyzes a variety of new and older data to extract meaningful business operations intelligence. Traditionally, data moves to the computation node; in Hadoop, data is processed where it resides. The types of questions Hadoop helps answer are [2]:

• Event analytics – what series of steps led to a purchase or registration?
• Large-scale web clickstream analytics
• Revenue assurance and price optimization
• Financial risk management, affinity engines, etc.

How does cloud computing come into the picture?

In cloud computing, we have a few options available for Hadoop implementation: 1) Amazon EC2 (IaaS), 2) Amazon Elastic MapReduce, and 3) Cloudera.

Amazon Elastic Compute Cloud (Amazon EC2 / IaaS) is a web service that provides resizable compute capacity in the cloud [4]. It is designed to make web-scale computing easier for developers. If you run Hadoop on Amazon EC2, consider using Amazon S3 for accessing job data (data transfer between S3 and EC2 instances is free). Initial input can be read from S3 when a cluster is launched, and the final output can be written back to S3 before the cluster is decommissioned [6]. Intermediate, temporary data, needed only between MapReduce passes, is more efficiently stored in Hadoop’s distributed file system. This became a popular way to process big data and led to the emergence of another service, Amazon Elastic MapReduce.
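
As a sketch of that setup, the job simply points its input and output paths at S3 URIs instead of HDFS paths. The bucket names and credentials below are placeholders, s3n:// is Hadoop’s S3-native filesystem scheme, and the mapper/reducer configuration is elided.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class S3JobSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Credentials for Hadoop's S3-native filesystem (placeholder values).
        conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");
        conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");

        Job job = new Job(conf, "s3 job");
        job.setJarByClass(S3JobSetup.class);
        // ... set mapper, reducer, and key/value classes as usual ...

        // Initial input is read from S3; final output is written back to S3.
        // Intermediate data between MapReduce passes would still live in HDFS.
        FileInputFormat.addInputPath(job, new Path("s3n://my-input-bucket/logs/"));
        FileOutputFormat.setOutputPath(job, new Path("s3n://my-output-bucket/results/"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```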

Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data [5].

[Figure: Amazon Elastic MapReduce]

It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon S3. In a nutshell, the Elastic MapReduce service runs a hosted Hadoop instance on an EC2 instance (the master) and can instantly provision other pre-configured EC2 instances (slave nodes) to distribute the MapReduce processing [5]. All nodes are terminated once the MapReduce tasks finish running.
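
A minimal sketch of launching such a job flow programmatically with the AWS SDK for Java follows; the jar location, bucket names, instance types, and Hadoop version are assumptions for illustration, not values from the article.

```java
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
import com.amazonaws.services.elasticmapreduce.model.HadoopJarStepConfig;
import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;
import com.amazonaws.services.elasticmapreduce.model.StepConfig;

public class EmrLauncher {
    public static void main(String[] args) {
        // Placeholder credentials; in practice load these securely.
        AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient(
                new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));

        // One step: run a word-count jar, reading from and writing to S3.
        StepConfig step = new StepConfig()
                .withName("word-count")
                .withActionOnFailure("TERMINATE_JOB_FLOW")
                .withHadoopJarStep(new HadoopJarStepConfig()
                        .withJar("s3n://my-bucket/jars/wordcount.jar")
                        .withArgs("s3n://my-bucket/input/", "s3n://my-bucket/output/"));

        // One master plus two slaves; EMR provisions and later terminates them.
        RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("wordcount-jobflow")
                .withLogUri("s3n://my-bucket/logs/")
                .withSteps(step)
                .withInstances(new JobFlowInstancesConfig()
                        .withInstanceCount(3)
                        .withMasterInstanceType("m1.small")
                        .withSlaveInstanceType("m1.small")
                        .withHadoopVersion("0.20"));

        RunJobFlowResult result = emr.runJobFlow(request);
        System.out.println("Started job flow: " + result.getJobFlowId());
    }
}
```

Once all steps complete, the service tears the cluster down, which matches the pay-for-what-you-use model described above.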

Cloudera has two products: Cloudera’s Distribution for Hadoop (CDH) and Cloudera Enterprise. CDH is a data management platform incorporating HDFS, Hadoop MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper, and Hue. It is available free under an Apache license [7].

Cloudera Enterprise is a package that includes Cloudera’s Distribution for Hadoop, production support, and tools designed to make it easier to run Hadoop in a production environment. Cloudera also offers services including support, consulting, and training (both public and private).

[Figure: Cloudera]

The Cloudera’s Distribution for Hadoop (CDH) cloud scripts enable you to run Hadoop on a cloud provider’s cluster. There is no need to install the RPMs for CDH or do any configuration; a working cluster starts immediately with one command. Cloudera supports Amazon EC2 only, providing Amazon Machine Images and associated launch scripts that make it easy to run CDH on EC2. CDH, being open source, is free; management services have to be paid for.

References

[1] Hadoop Wiki, PoweredBy, http://wiki.apache.org/hadoop/PoweredBy
[2] Miha Ahronovitz, Kuldip Pabla, “Why Hadoop as part of the IT?”, http://thecloudtutorial.com/hadoop-tutorial.html
[3] Apache Hadoop, http://hadoop.apache.org/
[4] Amazon EC2, http://aws.amazon.com/ec2/
[5] Amazon Elastic MapReduce, http://aws.amazon.com/elasticmapreduce/
[6] Ubin Malla, “Using Hadoop and Amazon Elastic MapReduce to Process Your Data More Efficiently”, http://blog.controlgroup.com/2010/10/13/hadoop-and-amazon-elastic-mapreduce-analyzing-log-files/
[7] Cloudera, Apache Hadoop for the Enterprise, http://www.cloudera.com/
[8] Amazon EC2 Cost Comparison Calculator, http://media.amazonwebservices.com/Amazon_EC2_Cost_Comparison_Calculator.xls
