Tutorial: Struts Portlet in Weblogic Portal Server



Table 4-5 describes each WebLogic Portal-specific field of the dialog.

Table 4-5 New Portal Web Project Dialog Data Fields – Weblogic Portal Information 


Configurations dropdown menu

The value automatically displayed in this dropdown menu corresponds to the selections made in the tree view of project facets. You can select a preset group of facets from the dropdown menu, or select and unselect specific check boxes in the tree display. If you select a customized set of facets, <custom> displays in the field.

Project Facet Display Tree

WebLogic Portal (primary)

Select the Weblogic Portal facets that you want to install. If certain facets depend on others, messages appear to describe these dependencies and your selections must conform to these requirements.

§  Portal Customizations Framework

§  Portal Framework

§  Portal Framework Struts

§  Portal Visitor Tools

§  Portal Web Application Services

§  WSRP Producer

For more information about each of these sub-features, refer to Weblogic Portal J2EE Library Descriptions.

Weblogic Portal (Optional)

Check this box to choose from optional facets. Optional facets include commerce tag libraries, analytics integration, and content presenter framework.

For more information about these features, refer to Weblogic Portal J2EE Library Descriptions.

Weblogic Portal Collaboration

Check this box (and one or both of its sub-features) to add the collaboration portlets to the project, or to enable the project as a GroupSpace project.

§  Collaboration Portlets – causes the J2EE library wlp-collab-portlets-app-lib to be associated with your project. You can use these portlets outside a GroupSpace environment.

§   GroupSpace – causes the GroupSpace-related J2EE libraries to be associated with the project. If you select this option, you must also select the Collaboration Portlets sub-feature.

For detailed instructions on creating a GroupSpace-based application, refer to the Communities Guide.

Note: Do not add GroupSpace to portal web projects that already contain non-GroupSpace portals. For more information, refer to the Communities Guide.

 

Show Runtimes

Click to view the runtimes associated with this web project.

New Portal Web Project – Web Module Dialog

The New Portal Web Project – Web Module dialog is shown in Figure 4-6.

Figure 4-6 New Portal Web Project – Web Module dialog

Table 4-6 describes each field of the dialog.

Table 4-6 New Portal Web Project – Web Module Data Fields  


Context Root

The context root of the web application.

Content Directory

The default web content directory name WebContent is automatically displayed; you can change it if you wish.

As a best practice, you should locate your portal file(s) and other portal resources in a web content directory that is subordinate to the web project directory.

Java Source Directory

The default Java source directory name src is automatically displayed; you can change it if you wish.

New Portal Web Project – WebLogic Web Module Dialog

The New Portal Web Project – WebLogic Web Module dialog is shown in Figure 4-7.

Figure 4-7 New Portal Web Project – WebLogic Web Module Dialog

Table 4-7 describes the dialog.

Table 4-7 New Portal Web Project – Weblogic Web Module Dialog Description 


Shared J2EE Libraries selection buttons

If you select the Use Shared J2EE Libraries radio button, Weblogic Portal creates associations with shared J2EE libraries rather than copying the complete set of JAR files into your project. BEA recommends that you use shared J2EE libraries because of their significant advantages in source control, file sharing, and patch application. With any future patch installations, Weblogic Portal supports only configurations that do not have copied J2EE library resources in the project. For more information about shared J2EE libraries, refer to Weblogic Portal and Shared J2EE Libraries.

If you choose not to use shared J2EE libraries, all of the necessary JAR files are copied directly into the project.

Once you have created the Portal EAR project and the Portal Web project, you can integrate the Struts application with the portal.

Integrating Struts Applications

You can integrate, or import, a Struts application into an enterprise application in Workshop for WebLogic. Once in Workshop for WebLogic, you can give the Struts application a portal user interface by creating portlets, add personalization and campaign functionality, and take advantage of WebLogic Portal’s content and user management services.

Preparing Your Struts Application for Integration

Follow the guidelines presented in this section as you prepare your existing Struts application for integration with WebLogic Portal:

Refactor

If you have a top-level Struts application, you must refactor it before you can integrate it. Any Struts applications that are intended for use in a portal must be developed as Struts modules, including the usage of the html:link tag for any URLs used in JSPs. Without this, it is impossible for WebLogic Portal to perform the necessary URL rewriting that is required to transparently modify links when the Struts application is used within a portlet.

As part of this process, modify your application to use WebLogic Portal tags using either of these methods:

§  Rely on the taglib mapping in web.xml to map the WebLogic Portal struts adapter tags to the URI that you already have in your JSPs; this allows you to use your existing JSPs.

§  To use Struts 1.2, which is the default version of Struts used for new portal web projects, BEA recommends that you change your JSPs to use WebLogic Portal taglib URIs; this prevents you from having to change your web.xml file, and provides the benefit that these taglibs are automatically deployed.

Add Tags if Needed

If a Struts application used within a portal also needs to support stand-alone operation, JSPs referenced by Action forwards must be authored to use several optional tags in the HTML tag library found in struts.jar and struts-adapter.jar (a file that is created by BEA). The first of these, <html:html>, is found in both Struts and the Struts-adapter. The Struts-adapter version overrides the Struts version of the tag and adds support for detecting whether or not to inhibit rendering of the tag output text if it is used from within a portal, where outputting the HTML text would result in non-well-formed HTML. Two additional tags are provided in the Struts-adapter version of the HTML tag library; use them in JSPs that also need to be used standalone: <html:head> and <html:body>. These two tags have the same portal-aware rendering behavior as the <html:html> tag.

Override Certain Behaviors of a RequestProcessor

Some Struts applications use a custom RequestProcessor. WebLogic Portal Struts integration requires that you override certain behaviors of a RequestProcessor. The class com.bea.struts.adapter.action.AdapterRequestProcessor, located in struts-adapter.jar, provides this standard behavior and must be used in all Struts applications used within a portal. Any custom RequestProcessors must either extend this class or use a utility class to perform the same required operation that this RequestProcessor performs. When extending this class, overrides of doForward() must call the superclass doForward() and also must not attempt to write to the response. Custom RequestProcessors that do not extend AdapterRequestProcessor must call com.bea.struts.adapter.action.AdapterRequestProcessorUtil.forwardUsingRequest() to perform any forwarding operations. (This method replaces an actual RequestDispatcher forward request with an operation that captures the forward URI for later use in including the URI into the portal output.)
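For illustration, a minimal custom RequestProcessor might look like the sketch below. The class and package names are hypothetical, and it assumes AdapterRequestProcessor exposes doForward() with the standard Struts 1.2 signature:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.bea.struts.adapter.action.AdapterRequestProcessor;

// Hypothetical example class; not part of WebLogic Portal or Struts.
public class MyPortalRequestProcessor extends AdapterRequestProcessor {

    protected void doForward(String uri, HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        // Custom pre-forward logic may go here, but it must not write to the response.
        super.doForward(uri, request, response); // lets the adapter capture the forward URI
    }
}

A RequestProcessor that does not extend AdapterRequestProcessor would instead call com.bea.struts.adapter.action.AdapterRequestProcessorUtil.forwardUsingRequest() for its forwarding operations, as described above.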

Refactor any Existing Custom Action Servlet

If a Struts application depends on the use of a custom Action servlet, it must be refactored to use a custom RequestProcessor instead, as outlined above, and as recommended by the Struts implementation. Since the page flow functionality in WebLogic Portal uses a custom Action servlet, and since there can be only one Action servlet in a portal web project, portal Struts integration requires that the Action servlet not be customized. For more information on refactoring an Action servlet customization into a RequestProcessor customization, see the Struts documentation at http://jakarta.apache.org/struts/.

Remove the <html:link> Tag

The StrutsContent control supports module switching using Action forwards. If the Action forward returned by an invoked Action results in a content URI that resides in another module, the current module is switched to the corresponding new module, and all further requests to the Struts portlet containing the control are performed using the new module. Perform module switching using only Action forwards, not by using the <html:link> tag to directly link to a JSP in another module; doing so might prevent the portal and Struts frameworks from correctly setting up and selecting the module.

Integration Steps

Once you create the Portal EAR project and the Portal Web project, the directory structure will look like this:

C:
  bea
    user_projects
      w4WP_workspaces
        StrutsApplication (Workspace Name)
          EAR
          WAR
            src
            WebContent

Now we are ready to integrate the Struts application into WebLogic Portal 10.

All the modifications are made in the "src" and "WebContent" folders that exist under the WAR folder.

Copy all the Java source files into the "src" folder, preserving the package structure. For example, if the class files reside in classes->struts->example in the plain Struts application, copy the corresponding ".java" files into the "src" folder under the package "struts->example".

There is no need to copy struts.jar and struts-adapter.jar into WEB-INF->lib: when you create a Struts portlet, those files are automatically included in the project under "Merged Project Content". Apart from these two files, copy all the other JAR files your application needs into WEB-INF->lib.

Copy the Struts application module's struts-config.xml (or other module configuration file) into WEB-INF, but rename it struts-auto-config-<module-path>.xml, where <module-path> is the module path of the Struts application relative to the web application root, with all instances of '/' or '\' changed to '-'.

For example, if the module path is /struts/my/module, then rename struts-config.xml to struts-auto-config-struts-my-module.xml. Naming the module configuration file in this manner enables the PageFlowActionServlet used as the Action Servlet to automatically register the module without explicitly registering it with an init-param in web.xml. If you don’t want to take advantage of this functionality, you can rename struts-config.xml arbitrarily, but you must manually register the module in web.xml as usual for a Struts 1.1 or 1.2 (Beehive) module.

In the module configuration file, add a line to configure the RequestProcessor that is required for portal integration (unless the Struts application requires its own custom RequestProcessor).
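A typical entry, based on the AdapterRequestProcessor class described earlier, looks roughly like this (the exact attributes of your <controller> element may vary):

<controller processorClass="com.bea.struts.adapter.action.AdapterRequestProcessor"/>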

The AdapterRequestProcessor class resides in struts-adapter.jar, which WebLogic includes in the portal application automatically, so there is no need to add that JAR explicitly.

Copy all the JSP files into the Struts module directory under WebContent, using the same folder structure as in the plain Struts application, and reference those JSP paths in struts-auto-config-strutsApp.xml accordingly. You can create this module directory yourself; alternatively, if you have named the configuration file struts-auto-config-strutsApp.xml, then when you create the portlet by right-clicking that file and selecting "Generate Portlet", a strutsApp folder is created in the WebContent directory and the generated portlet is placed in it.

Use the Portlet Wizard to generate a portlet based on a Struts module, as explained in this section.

To create a Struts portlet, follow these steps:

1.       Right-click the Struts application module’s XML configuration file located in the WEB-INF directory of the portal web application.

2.       Select Generate Portlet from the menu. The wizard automatically collects and displays the module path and configuration file name(s) in the Struts Config File dialog. An example is shown in Figure 5-17. Use the Browse and Add buttons to locate and add additional configuration files, if applicable.

Figure 5-17 Struts Config File Dialog

3.       Click Next.

4.       In the Struts Actions dialog, specify an action for the Struts portlet. The actions that appear in the drop-down menu are based on entries in the configuration file(s) that were added previously.

Figure 5-18 Struts Actions Dialog

5.       Click Create.

The Workshop for WebLogic window updates, adding the Portlet_Name.portlet file to the display tree; by default, Workshop for WebLogic places the portlet file in the directory that you specified in the Struts Module Path dialog of the wizard.

Create a portal by selecting File > New > Portal, and then save it.

Now click the portlet file in the Package Explorer, drag it into the portal window, and save the portal.

Now select the .portal file, right-click it, and choose Run on Server from the Run menu.

Best Practices and Development Issues

Use the following guidelines for integrating Struts applications in portals:


Tutorial on Hadoop with VMware Player



Map Reduce (Source: Google)


Functional Programming
According to Wikipedia, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state. Since there are no hidden dependencies (via shared state), functions in a DAG can run anywhere in parallel as long as one is not an ancestor of the other; in other words, analyzing the parallelism is much easier when there is no hidden dependency on shared state. Map/reduce is a special form of such a directed acyclic graph that is applicable to a wide range of use cases. It is organized as a "map" function that transforms a piece of data into some number of key/value pairs. Each of these elements is then sorted by its key and routed to the same node, where a "reduce" function is used to merge the values of the same key into a single result.
Map Reduce

Map/reduce is a way to take a big task and divide it into discrete tasks that can be done in parallel. Map/Reduce is just a pair of functions operating over a list of data.
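As a conceptual sketch only (plain Java, not the Hadoop API), a word count expressed as a "map" step that emits (word, 1) pairs and a "reduce" step that merges the values sharing a key might look like this:

import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "to do");

        // "map": each input line is transformed into a number of key/value pairs
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                pairs.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
            }
        }

        // sort by key and "reduce": merge all values of the same key into a single result
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<String, Integer> pair : pairs) {
            Integer current = counts.get(pair.getKey());
            counts.put(pair.getKey(), current == null ? pair.getValue() : current + pair.getValue());
        }
        System.out.println(counts); // {be=2, do=1, not=1, or=1, to=3}
    }
}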

MapReduce is a patented software framework introduced by Google to support distributed computing on large data sets on clusters of computers.

The framework is inspired by map and reduce functions commonly used in functional programming,[3] although their purpose in the MapReduce framework is not the same as their original forms.
Hadoop
A Large scale Batch Data Processing System.

It uses MAP-REDUCE for computation and HDFS for storage.

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.

It is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the Google File System and of MapReduce. HDFS is a highly fault-tolerant distributed file system and like Hadoop designed to be deployed on low-cost hardware. It provides high throughput access to application data and is suitable for applications that have large data sets.

Hadoop is an open source Java implementation of Google's MapReduce algorithm, along with an infrastructure to support distributing it over multiple machines. This includes its own filesystem (HDFS, the Hadoop Distributed File System, based on the Google File System), which is specifically tailored for dealing with large files. When thinking about Hadoop, it is important to keep in mind that this infrastructure is a huge part of it. Implementing MapReduce is simple; implementing a system that can intelligently manage the distribution of processing and of your files, and break those files down into more manageable chunks for processing in an efficient way, is not.

HDFS breaks files down into blocks which can be replicated across its network (how many times a block is replicated is determined by your application and can be specified on a per-file basis). This is one of the most important performance features and, according to the docs, "…is a feature that needs a lot of tuning and experience." You really don't want 50 machines all trying to pull from a 1 TB file on a single data node at the same time, but you also don't want to replicate a 1 TB file out to 50 machines. So it's a balancing act.

A Hadoop installation is broken into three types of nodes:

§  The NameNode acts as the HDFS master, managing all decisions regarding data replication.

§  The JobTracker manages the MapReduce work. It "…is the central location for submitting and tracking MR jobs in a network environment."

§  The TaskTracker and DataNode daemons do the grunt work of processing and storing the data.

Hadoop - NameNode, DataNode, JobTracker, TaskTracker

The JobTracker first determines the number of splits (each split is configurable, ~16-64 MB) from the input path and selects some TaskTrackers based on their network proximity to the data sources; the JobTracker then sends the task requests to those selected TaskTrackers.

Each TaskTracker starts the map phase by extracting the input data from its splits. For each record parsed by the "InputFormat", it invokes the user-provided "map" function, which emits a number of key/value pairs into a memory buffer. A periodic wakeup process sorts the memory buffer into per-reducer partitions, invoking the "combine" function along the way. The key/value pairs are sorted into one of the R local files (supposing there are R reducer nodes).

When the map tasks complete (all splits are done), the TaskTracker notifies the JobTracker. When all the TaskTrackers are done, the JobTracker notifies the selected TaskTrackers to start the reduce phase.

Each reduce-side TaskTracker reads the region files remotely. It sorts the key/value pairs and, for each key, invokes the "reduce" function, which collects the key/aggregatedValue into the output file (one per reducer node).

The Map/Reduce framework is resilient to the crash of any component. The JobTracker keeps track of the progress of each phase and periodically pings the TaskTrackers for their health status. When a map-phase TaskTracker crashes, the JobTracker reassigns the map task to a different TaskTracker node, which reruns all of its assigned splits. If a reduce-phase TaskTracker crashes, the JobTracker reruns the reduce on a different TaskTracker.
Let’s try Hands on Hadoop
The objective of this tutorial is to set up a multi-node Hadoop cluster using the Hadoop Distributed File System (HDFS) on Ubuntu Linux, with the help of VMware Player.

Hadoop and VMware Player

Installations / Configurations Needed:

Physical Machine: a laptop with 60 GB HDD, 2 GB RAM, 32-bit support, running Ubuntu 10.04 LTS (the Lucid Lynx); IP address 192.168.1.3 [used in the configuration files].

Virtual Machine: see the VMware Player subsection below.

Download Ubuntu ISO file

The Ubuntu 10.04 LTS (the Lucid Lynx) ISO file is needed to install the OS on the virtual machine created by VMware Player for the multi-node Hadoop cluster.

Download Ubuntu Desktop Edition

http://www.ubuntu.com/desktop/get-ubuntu/download

Note: Log in with the user "root" to avoid any kind of permission issues (on your machine and on the virtual machine).

Update the Ubuntu packages: sudo apt-get update

VMware Player [Freeware]

Download it from http://downloads.vmware.com/d/info/desktop_downloads/vmware_player/3_0

Download VMware Player

Select VMware Player to Download

VMware Player Free Product Download

Install VMware Player on your physical machine with the use of the downloaded bundle.

VMware Player - Ready to install

VMware Player - installing

Now create a virtual machine with VMware Player and install Ubuntu 10.04 LTS on it using the ISO file, making the appropriate configuration settings for the virtual machine.

Browse Ubuntu ISO

Proceed with the instructions and let the setup finish.

Virtual Machine in VMware Player

Once the setup has completed successfully, select Play virtual machine.

Start Virtual Machine in VMware Player

Open Terminal (Command prompt in Ubuntu) and check the IP address of the Virtual Machine.

NOTE: The IP address may change, so if the virtual machine cannot be reached over SSH from the physical machine, check its IP address first.

Ubuntu Virtual Machine - ifconfig

Apply the following configuration on both the physical and the virtual machine; this applies to the Java 6 and Hadoop installation steps only.

Installing Java 6

sudo apt-get install sun-java6-jdk

sudo update-java-alternatives -s java-6-sun [sets Sun Java as the default; verify with java -version]

Setting up Hadoop  0.20.2

Download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core and place under /usr/local/hadoop
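For example (a sketch; the archive name and download location are assumptions, and the later commands in this tutorial refer to the unpacked hadoop-0.20.2 directory, which you can rename or symlink to /usr/local/hadoop if you prefer that path):

sudo tar xzf ~/Downloads/hadoop-0.20.2.tar.gz -C /usr/local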

HADOOP Configurations

Hadoop requires SSH access to manage its nodes, i.e. the remote machines [in our case the virtual machine] plus your local machine if you want to run Hadoop on it.

On Physical Machine

Generate an SSH key

Generate an SSH key

Enable SSH access to your local machine with this newly created key.

Enable SSH access to your local machine

Or you can copy it from $HOME/.ssh/id_rsa.pub to $HOME/.ssh/authorized_keys manually.
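For reference, the commands behind these two steps typically look like the following (a sketch; the empty passphrase is only a convenience for this tutorial setup):

ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys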

Test the SSH setup by connecting to your local machine with the root  user.

Test the SSH setup

Use ssh 192.168.1.3 from the physical machine as well; it will give the same result.

On Virtual Machine

The root user account on the slave (Virtual Machine) should be able to access physical machine via a password-less SSH login.

Add the physical machine's public SSH key (which should be in $HOME/.ssh/id_rsa.pub) to the authorized_keys file of the virtual machine (in that user's $HOME/.ssh directory). You can do this manually:

(Physical Machine)$HOME/.ssh/id_rsa.pub -> (VM)$HOME/.ssh/authorized_keys
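One way to do this from the physical machine (a sketch; 192.168.28.136 is the virtual machine's IP address used later in this tutorial):

ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@192.168.28.136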

The SSH key will look something like this (yours will, of course, be different):

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAwjhqJ7MyXGnn5Ly+0iOwnHETAR6Y3Lh3UUKbaCIP2/0FsVOWhBvcSLMEgT1ewrRPKk9IGoegMCMdHDGDfabzO4tUsfCdfvvb9KFRcBU3pKdq+yVvCVxXtoD7lNnMtckUwSz5F1d04Z+MDPbDixn6IAu/GeX9aE2mrJRBq1Pzn3iB4GpjnSPoLwQvEO835EMchq4AI92+glrySptpx2MGporxs5LvDaX87yMsPyF5tutuQ+WwRiLfAW34OfrYsZ/Iqdak5agE51vlV/SESYJ7OqdD3+aTQghlmPYE4ILivCsqc7wxT+XtPwR1B9jpOSkpvjOknPgZ0wNi8LD5zyEQ3w== root@mitesh-laptop

Use ssh 192.168.1.3 from the virtual machine to verify SSH access and get a feel for how SSH works.

To double-check connectivity, ping 192.168.1.3 and 192.168.28.136 from each machine.

For detailed information on network settings, visit http://www.vmware.com/support/ws55/doc/ws_net_configurations_common.html (it covers VMware Workstation, but VMware Player uses similar concepts).

Using 0.0.0.0 for the various networking-related Hadoop configuration options will result in Hadoop binding to the IPv6 addresses of the Ubuntu box.

To disable IPv6 on Ubuntu 10.04 LTS, open /etc/sysctl.conf in the editor of your choice and add the following lines to the end of the file:

#disable ipv6

net.ipv6.conf.all.disable_ipv6 = 1

net.ipv6.conf.default.disable_ipv6 = 1

net.ipv6.conf.lo.disable_ipv6 = 1

Ubuntu - Disable IPv6

In <HADOOP_INSTALL>/conf/hadoop-env.sh, set the JAVA_HOME environment variable to the Sun JDK/JRE 6 directory.

 

# The java implementation to use.  Required.

export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.20

 

In <HADOOP_INSTALL>/conf/core-site.xml:

Configure the directory where Hadoop will store its data files, the network ports it listens to, and so on. Our setup will use Hadoop's Distributed File System, HDFS, even though our little "cluster" only contains our single local machine.

Hadoop - core-site.xml

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp/dir/hadoop-${user.name}</value>
</property>
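The same file also typically tells HDFS clients where the namenode runs. A sketch based on the master IP used in this tutorial (port 54310 is the conventional choice in the referenced multi-node guide; adjust it to your setup):

<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.1.3:54310</value>
</property>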

In <HADOOP_INSTALL>/conf/mapred-site.xml:

<property>

  <name>mapred.job.tracker</name>

  <value>192.168.1.3:54311</value>

</property>

Hadoop - mapred-site.xml

In <HADOOP_INSTALL>/conf/hdfs-site.xml:

 

<property>

  <name>dfs.replication</name>

  <value>2</value>

</property>

Master/Slave Settings (Physical Machine vs. Virtual Machine): on the Physical Machine only

<HADOOP_INSTALL>/conf/masters

The conf/masters file defines the namenodes of our multi-node cluster. In our case, this is just the master machine.

192.168.1.3

<HADOOP_INSTALL>/conf/slaves

 This conf/slaves file lists the hosts, one per line, where the Hadoop slave daemons (datanodes and tasktrackers) will be run. We want both the master box and the slave box to act as Hadoop slaves because we want both of them to store and process data.

192.168.1.3

192.168.28.136

NOTE: Here 192.168.1.3 and 192.168.28.136 are the IP addresses of the physical machine and the virtual machine respectively; they may differ in your case. Just enter the correct IP addresses in the files and you are done!

Let’s enjoy the ride with Hadoop:

All Set for having “HANDS ON HADOOP”.

Formatting the name node

On the Physical Machine and the Virtual Machine

The first step in starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top of the local filesystem of your "cluster" (which includes only your local machine if you followed this tutorial). You need to do this the first time you set up a Hadoop cluster. Do not format a running Hadoop filesystem; this will cause all your data to be erased.

hadoop namenode -format

Starting the multi-node cluster

1.    Start HDFS daemons

Run the command <HADOOP_INSTALL>/bin/start-dfs.sh on the machine you want the (primary) namenode to run on. This will bring up HDFS with the namenode running on the machine you ran the command on, and datanodes on the machines listed in the conf/slaves file.

Physical Machine

Hadoop - start-dfs.sh

VM

Hadoop - DataNode on Slave Machine

2.    Start MapReduce daemons

Run the command <HADOOP_INSTALL>/bin/start-mapred.sh on the machine you want the jobtracker to run on. This will bring up the MapReduce cluster with the jobtracker running on the machine you ran the command on, and tasktrackers on the machines listed in the conf/slaves file.

Physical Machine

Hadoop - Start MapReduce daemons

VM

TaskTracker in Hadoop

Running a MapReduce job

Here’s the example input data I have used for the multi-node cluster setup described in this tutorial.

All ebooks should be in plain text us-ascii encoding.

http://www.gutenberg.org/etext/20417

http://www.gutenberg.org/etext/5000

http://www.gutenberg.org/etext/4300

http://www.gutenberg.org/etext/132

http://www.gutenberg.org/etext/1661

http://www.gutenberg.org/etext/972

http://www.gutenberg.org/etext/19699

Download the above ebooks and store them in the local file system.

Copy local example data to HDFS

Hadoop - Copy local example data to HDFS
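For example, if the ebooks were saved to a local directory such as /tmp/examples (a hypothetical path), they can be copied into an HDFS directory named examples, which matches the input directory used by the wordcount command below:

bin/hadoop dfs -copyFromLocal /tmp/examples examples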

Run the MapReduce job

hadoop-0.20.2/bin/hadoop jar hadoop-0.20.2-examples.jar wordcount examples example-output

Failed Hadoop Job

Retrieve the job result from HDFS

You can read the files directly from HDFS without copying them to the local file system; in this tutorial, though, we will copy the results to the local file system.

mkdir /tmp/example-output-final

bin/hadoop dfs -getmerge example-output /tmp/example-output-final

Hadoop - Word count example

Hadoop - MapReduce Administration

Hadoop - Running and Completed Job

Task Tracker Web Interface

Hadoop - Task Tracker Web Interface

Hadoop - NameNode Cluster Summary

References

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

http://www.michael-noll.com/wiki/Writing_An_Hadoop_MapReduce_Program_In_Python

http://java.dzone.com/articles/how-hadoop-mapreduce-works

http://ayende.com/Blog/archive/2010/03/14/map-reduce-ndash-a-visual-explanation.aspx

http://www.youtube.com/watch?v=Aq0x2z69syM

http://www.gridgainsystems.com/wiki/display/GG15UG/MapReduce+Overview

http://map-reduce.wikispaces.asu.edu/

http://blogs.sun.com/fifors/entry/map_reduce

http://www.vmware.com/support/ws55/doc/ws_net_configurations_common.html

http://www.ibm.com/developerworks/aix/library/au-cloud_apache/

 


Vertical and Horizontal Markets (Verticals and Horizontals)

A vertical market (often referred to simply as a “vertical”) is a group of similar businesses and customers that engage in trade based on specific and specialized needs.


Some common examples of vertical markets:

A horizontal market is a market which meets a given need of a wide variety of industries, rather than a specific one.

In technology, horizontal markets consist of customers that share a common need that exists in many or all (vertical) industries. For example, customers that need to purchase computer security services or software exist in such varied industries as finance, healthcare, government, etc. Together, these customers constitute the security horizontal market. Other examples of horizontal markets include computer storage, accounting, desktop graphics, computer-aided design, sales force automation, and human resources.

Comparison with horizontal markets

A vertical market is a market which meets the needs of a particular industry: for example, a piece of equipment used only by semiconductor manufacturers. It is also known as a niche market.[1]

A horizontal market is a market which meets a given need of a wide variety of industries, rather than a specific one: for example, word processing software[2] or machine-to-machine technologies. On occasion a horizontal market can be highly specialized at the same time, for instance consider the market for converting mainframe databases developed with the tools of a particular company. The market for such a database conversion service might involve many vertical markets (aerospace, manufacturing, banking, etc) but nevertheless be labelled “niche” if the database technology is technically obsolete and no longer mainstream.

References:

http://en.wikipedia.org/wiki/Vertical_market

http://en.wikipedia.org/wiki/Horizontal_market

How to Configure CloudAnalyst in Eclipse



Create New Java Project


New Java Project in Eclipse

Create Java Project in Eclipse

New Java Project in Eclipse: Java Settings


Go to File->Import

New Java Project in Eclipse: Import Source Code from Existing Project

New Java Project in Eclipse: Import resources from Local File System

Run CloudAnalyst in Eclipse

Done!!!

CloudAnalyst GUI


 

 

How to Run and Configure CloudAnalyst


We have moved this post to http://etutorialsworld.com/2016/03/how-to-run-and-configure-cloudanalyst/

TROVE – High Performance Collections for Java


Trove is a fast, lightweight implementation of the java.util Collections API. These implementations are designed to be pluggable replacements for their JDK equivalents.


Gap in Java

Collections in Java accept only reference types as their elements, not primitive data types; a collection of a primitive type (for example, List<int>) does not even compile. In Java, when we want to store primitive data types in collections, we need to use the wrapper classes.

All Java collection classes store references to the objects they collect; primitive values do not fit that definition.

To circumvent this problem, JDK 5 and onwards have autoboxing, wherein primitives are converted to the appropriate wrapper objects and back as they are added to or read from collections.
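A small illustration of autoboxing with a JDK collection:

import java.util.ArrayList;
import java.util.List;

List<Integer> numbers = new ArrayList<Integer>();
numbers.add(42);          // the int 42 is autoboxed into an Integer
int n = numbers.get(0);   // the Integer is unboxed back into an int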

Using java.util.HashMap, it is not possible to use Java language arrays as keys. For example, this code:

char[] foo, bar;
foo = new char[] {'a','b','c'};
bar = new char[] {'a','b','c'};
System.out.println(foo.hashCode() == bar.hashCode() ? "equal" : "not equal");
System.out.println(foo.equals(bar) ? "equal" : "not equal");
produces this output:

not equal
not equal

And so an entry stored in a java.util.HashMap with foo as a key could not be retrieved with bar, since there is no way to override hashCode() or equals() on language array objects.


Trove comes to the Rescue

Trove provides collections for primitive types. Collections which store primitives directly will require less space and yield significant performance gains.
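A small sketch of the difference, assuming the Trove 2.x gnu.trove package names used elsewhere in this post (class names may differ in other Trove versions):

import gnu.trove.TIntIntHashMap;

import java.util.HashMap;
import java.util.Map;

public class TroveVsJdkMap {
    public static void main(String[] args) {
        // Trove: int keys and values are stored directly, with no wrapper objects
        TIntIntHashMap primitiveMap = new TIntIntHashMap();
        primitiveMap.put(1, 100);
        int fromTrove = primitiveMap.get(1);

        // JDK: every int is boxed into an Integer on the way in and unboxed on the way out
        Map<Integer, Integer> boxedMap = new HashMap<Integer, Integer>();
        boxedMap.put(1, 100);
        int fromJdk = boxedMap.get(1);

        System.out.println(fromTrove + " " + fromJdk); // 100 100
    }
}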

The Trove maps/sets use open addressing instead of the chaining approach taken by the JDK hashtables.

What is Open Addressing?

Open addressing, or closed hashing, is a method of collision resolution in hash tables. With this method a hash collision is resolved by probing, or searching through alternate locations in the array (the probe sequence) until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table.

Open Addressing vs. Chaining

Collision resolution: chaining uses an external data structure; open addressing uses the hash table itself.

Memory waste: chaining has a pointer-size overhead per entry (list heads stored in the table); open addressing has no such overhead.

Dependence on the table's load factor: with chaining, performance degrades in direct proportion to the load factor; with open addressing, it is proportional to loadFactor / (1 - loadFactor).

Storing more items than the table size: chaining allows it; open addressing does not, and it is recommended to keep the table's load factor below 0.7.

Hash function requirements: chaining needs a uniform distribution; open addressing needs a uniform distribution and should also avoid clustering.

Removals: with chaining, removals are fine; with open addressing, removals clog the hash table with "DELETED" entries.

Implementation: chaining is simple; a correct implementation of an open-addressing hash table is quite tricky.

The size of the tables used in Trove’s maps/sets is always a prime number, improving the probability of an optimal distribution of entries across the table, and so reducing the likelihood of performance-degrading collisions. Trove sets are not backed by maps, and so using a THashSet does not result in the allocation of an unused “values” array.

In a gnu.trove.THashMap, however, you can implement a TObjectHashingStrategy to enable hashing on arrays:

import gnu.trove.TObjectHashingStrategy;

class CharArrayStrategy implements TObjectHashingStrategy {

    public int computeHashCode(Object o) {
        char[] c = (char[]) o;
        // use the shift-add-xor class of string hashing functions
        // cf. Ramakrishna and Zobel, "Performance in Practice of String Hashing Functions"
        int h = 31; // seed chosen at random
        for (int i = 0; i < c.length; i++) { // could skip invariants
            h = h ^ ((h << 5) + (h >> 2) + c[i]); // L=5, R=2 works well for ASCII input
        }
        return h;
    }

    public boolean equals(Object o1, Object o2) {
        char[] c1 = (char[]) o1;
        char[] c2 = (char[]) o2;
        if (c1.length != c2.length) { // could drop this check for fixed-length keys
            return false;
        }
        for (int i = 0, len = c1.length; i < len; i++) { // could skip invariants
            if (c1[i] != c2[i]) {
                return false;
            }
        }
        return true;
    }
}
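A usage sketch for the strategy above, assuming the Trove 2.x THashMap constructor that accepts a TObjectHashingStrategy:

import gnu.trove.THashMap;

public class CharArrayMapDemo {
    public static void main(String[] args) {
        THashMap map = new THashMap(new CharArrayStrategy());
        map.put(new char[] {'a', 'b', 'c'}, "entry");

        // Two distinct arrays with the same contents now hash and compare as equal keys.
        Object found = map.get(new char[] {'a', 'b', 'c'});
        System.out.println(found); // entry
    }
}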


References:

http://trove.starlight-systems.com/

http://en.wikipedia.org/wiki/Open_addressing

How ANT works?



Each Project will have a build file (build.xml)

Each build file will contain one or more Targets


The Target to be executed:

                Is either explicitly selected on the command line

                Or a project default Target is executed


Each Target is executed only once.
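For example, from the directory containing build.xml (a sketch; target names depend on your build file, such as the compile target used in the examples below):

ant                           (runs the project's default target)
ant compile                   (runs the compile target explicitly)
ant -f mybuild.xml compile    (uses a build file other than build.xml)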

Project

is the top level element in an Ant script

<project> has three optional attributes:

                name: the name of the project

                default: the default target to use when no target is supplied

                basedir: the base directory from which all path calculations are done

<project name="MyProject" default="compile">

                <!-- properties and targets will come here… -->

</project>

Targets

             Each project defines zero or more targets

             A target is a set of tasks you want to be executed

             When starting Ant, you can select which target(s) you want to have executed

             When no target is given, the project’s default is used

             Targets can be conditionally executed (using if/unless)

             A target can depend on other targets

             Target dependencies are transitive

             Each Target will contain one or more Tasks

             Some Tasks are executed conditionally

             Tasks are implemented as Java classes

<project name="MyProject" default="compile">

                <property name="buildDir" value="build"/>

                <property name="srcDir" value="."/>

                <target name="compile">

                                <!-- Tasks will come here… -->

                </target>

</project>

Tasks

             A task is a piece of code that can be executed

             A task can have multiple attributes (a.k.a arguments)

             The value of an attribute might use the value of a property.

             Ant comes with over 80 core tasks, and 60 optional tasks

             Ant task extensions can be easily written for any unique problem

<project name="MyProject" default="compile">

                <property name="buildDir" value="build"/>

                <property name="srcDir" value="."/>

                <target name="compile">

                                <javac srcdir="${srcDir}" destdir="${buildDir}"/>

                </target>

</project>
