Sunday, December 30, 2012

Apache Maven (II) - Project Object Model

This is my second post about Apache Maven. In my last post, Apache Maven (I), we saw how to install Maven and build a minimal project. Today, we are going to see what the Project Object Model (POM) is.

The POM is the descriptor of a Maven project. A project may be described by a single XML file (pom.xml) or by several of them scattered around its directory tree. In each 'pom.xml', you provide all the information required to describe what the project is about.

In this post we will see only a part of the POM, but you can get a full guide with all the elements and their descriptions by visiting the POM page at the official Maven website: http://maven.apache.org/pom.html

POM elements


The root element of 'pom.xml' is 'project'

<project xmlns="http://maven.apache.org/POM/4.0.0" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://maven.apache.org/POM/4.0.0                      
                           http://maven.apache.org/xsd/maven-4.0.0.xsd"> 
  <modelVersion>4.0.0</modelVersion> 
...

</project>

Don't worry about the attributes of the 'project' element. They define the namespace and the location of the schema used for validation. Just keep them as in the example.

The 'modelVersion' element defines which version of the project object model is being used. Version 4.0.0 is the one accepted by Maven 2 and 3.

Essential identity elements

There are three elements that identify a project: 'groupId', 'artifactId' and 'version'. Their values are used to name the output package (artifact) and to place it on a specific path in the repository. The local repository is usually at '~/.m2/repository'. From now on, we will refer to that path as '$M2_REPO'.
  • groupId: Generally, this element identifies the company / organization / group that creates the project. The 'groupId' can use "dot notation", but it is not required. We can find 'groupId's like "com.mybusiness.mydepartment" or just "myfantasticbusiness". All the artifacts with this 'groupId' are stored in subdirectories of a directory named after its value. If the value is a dotted string, Maven creates one subdirectory for each part of the 'groupId'. For "com.mybusiness.mydepartment", you will find the project output under "$M2_REPO/com/mybusiness/mydepartment".
  • artifactId: It indicates the unique base name of the primary artifact generated by this project. An artifact could be a jar, war, ear, ... Its value is generally the name the project is known by. In the repository, each artifact gets a directory named after its 'artifactId' inside the directory of its 'groupId'.
  • version: It indicates the version of the artifact generated by the project. The repository can store multiple versions of the artifact. For each version you will find a directory with the artifacts inside.
An artifact generated by Maven has the following name structure: <artifactId>-<version>.<extension> (for example, myfirstapp-1.0.jar).

If you are using the inheritance mechanism (we will see it below), you are not required to define the 'groupId' and 'version' elements explicitly.
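
As a minimal sketch of how the three identity elements map to a repository location, the following POM reuses the sample coordinates from this post. The path in the comment assumes the default jar packaging and the default local repository referred to above as '$M2_REPO'.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- groupId: each dot-separated part becomes a directory in the repository -->
  <groupId>com.mycompany.mydepartment</groupId>
  <!-- artifactId: base name of the artifact and of its directory -->
  <artifactId>myfirstapp</artifactId>
  <!-- version: one directory per stored version -->
  <version>1.0</version>

  <!-- After 'mvn install', the artifact is expected at:
       $M2_REPO/com/mycompany/mydepartment/myfirstapp/1.0/myfirstapp-1.0.jar -->
</project>
```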

Other basic elements

There are other elements that, although not required, are highly recommended.
  • packaging:  It is the project's artifact type. Maven core provides the following types: pom, jar, maven-plugin, ejb, war, ear, rar and par. That list can be extended by some plugins.
  • name: This element indicates the display name used for the project.
  • url: The location where the project's site can be found.
  • description: A basic description of your project.
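
To illustrate, these four elements could be added to the sample project like this. It is only a sketch: the name, the URL (an example.com placeholder) and the description are made-up values, not taken from a real project.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.mydepartment</groupId>
  <artifactId>myfirstapp</artifactId>
  <version>1.0</version>

  <!-- Artifact type; 'jar' is what Maven assumes when the element is omitted -->
  <packaging>jar</packaging>
  <!-- Display name (hypothetical value) -->
  <name>My First App</name>
  <!-- Location of the project's site (placeholder URL) -->
  <url>http://www.example.com/myfirstapp</url>
  <!-- Short description (hypothetical value) -->
  <description>A sample application used to explore the POM.</description>
</project>
```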

Build process

There is another element that is very important in the POM. Although you are not required to declare it, you need to know what it is for and how to set it up. It is the 'build' element, where we can configure everything about the build process.

The 'build' element can be found in two different scopes. The first one is the project scope: 'build' is declared as a child of the 'project' element, and its settings apply to all the projects/subprojects that the 'pom.xml' covers. The other one is inside a profile (we will work with profiles in later posts). In your project you can define several profiles and activate their settings depending on some parameters. The 'build' element can be declared or redeclared in each 'profile' element to modify the behavior of the build depending on the profiles chosen at build time.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.mydepartment</groupId>
  <artifactId>myfirstapp</artifactId>
  <version>1.0</version>

  <build>
    <!-- Here, you can set up your build element -->
  </build>

  <profiles>
    <profile>

      <build>
        <!-- Here you can set up your build element -->
      </build>
    </profile>
  </profiles>
</project>

Let's see the most frequently used elements:
  • defaultGoal: The default goal to execute if none is given on the command line.
  • directory: The directory where Maven puts all the build output. The default value is '${basedir}/target', that is, a directory named 'target' inside our project directory.
  • finalName: The name of the bundled project. By default it is set to ${artifactId}-${version}.
  • filters: It defines a set of properties files (e.g. sample.properties), each containing a list of key=value pairs, that will be applied to the resources defined in the 'resources' element that have filtering set to true.
  • resources: It contains a set of 'resource' elements that describe files associated with the project that do not contain code. Each 'resource' element selects a set of files through 'include' and 'exclude' elements, whose patterns determine which files of which directories are included and which are not.
  • plugins: It contains a set of 'plugin' elements, each describing a Maven plugin to be used in the build process. There is a wide variety of plugins. Essentially, in each 'plugin' element you indicate which plugin to use, identifying it by its 'groupId', 'artifactId' and 'version'. Then you configure it (each plugin has its own configuration parameters) and optionally attach a goal that the plugin provides to a specific phase of the build life cycle (I will also talk about this in a following post).
  • pluginManagement: It also has a 'plugins' element as a child. The main difference between the 'plugins' element directly under 'build' and the one under 'pluginManagement' is that the configuration found in 'pluginManagement' is not executed directly by Maven. Each 'plugin' description in 'pluginManagement' is applied to the corresponding plugin listed in the 'plugins' element of 'build'. The advantage of configuring plugins through 'pluginManagement' is that all these descriptions can be inherited by child subprojects.
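
The elements above could be combined as in the following sketch. The filter file path, the include/exclude patterns and the compiler plugin version and settings are assumptions chosen for illustration; adapt them to your own project.

```xml
<build>
  <!-- Goal executed when 'mvn' is run without arguments -->
  <defaultGoal>install</defaultGoal>
  <!-- Output directory (this is already the default value) -->
  <directory>${basedir}/target</directory>
  <!-- Name of the bundled project, without extension -->
  <finalName>${project.artifactId}-${project.version}</finalName>

  <!-- Properties files whose key=value pairs are used for filtering
       (hypothetical path) -->
  <filters>
    <filter>src/main/filters/sample.properties</filter>
  </filters>

  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <!-- Replace ${...} placeholders in the selected files -->
      <filtering>true</filtering>
      <!-- Only files matching an include pattern are copied -->
      <includes>
        <include>**/*.properties</include>
      </includes>
      <excludes>
        <exclude>**/*.bak</exclude>
      </excludes>
    </resource>
  </resources>

  <!-- Declared here, the configuration is not executed by itself;
       it applies to matching entries of the 'plugins' list -->
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <!-- Version assumed for this example -->
        <version>2.5.1</version>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>

  <!-- Plugins actually used; version and configuration come from
       the pluginManagement section above -->
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
    </plugin>
  </plugins>
</build>
```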

Reporting your project

While you use Maven to build your project, you can also use it to publish all the information you need about it. When you execute the command 'mvn site', you are telling Maven to generate all the reports about your project. There are a lot of plugins that provide extra information, such as generating Javadoc, checking code style, finding bugs, etc. You can set up your site generation using the 'reporting' element.

As with the 'build' element, you can use it in two different scopes: as a child of the 'project' element or inside a particular 'profile'.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.mydepartment</groupId>
  <artifactId>myfirstapp</artifactId>
  <version>1.0</version>

   ...

  <reporting>
    <!-- Here, you can set up your reporting element -->
  </reporting>

  <profiles>
    <profile>

      <reporting>
        <!-- Here you can set up your reporting element -->
      </reporting>
    </profile>
  </profiles>
</project>
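
As a small sketch of what could go inside 'reporting', the fragment below adds a Javadoc report to the generated site. The plugin version is an assumption made for this example; check which one is current before using it.

```xml
<reporting>
  <plugins>
    <!-- Generates the API documentation as part of 'mvn site' -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <!-- Version assumed for this example -->
      <version>2.9</version>
    </plugin>
  </plugins>
</reporting>
```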


We will look at it in more detail when we talk about plugin configuration in the following posts.

Dependencies

It's time to introduce one of the most important features of Maven: a mechanism to manage and handle the library dependencies of our project. I will explain it in more detail in the following posts, but it is important to know that we can use a 'dependencies' element to include in our project all the libraries that we need to compile, run or test our app.

<project>

    ...

  <dependencies>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-io</artifactId>
      <version>1.3.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <!-- Commons Math 3.x is published under the artifactId 'commons-math3' -->
      <artifactId>commons-math3</artifactId>
      <version>3.0</version>
    </dependency>
  </dependencies>
</project>

You provide the 'groupId', 'artifactId' and 'version' of each 'dependency' element you need in your project. You can optionally provide the scope of the dependency (compile, test, runtime ...) and its type.
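
For instance, a dependency that is needed only to compile and run the tests could be declared as in this sketch. JUnit and its version are used here as an assumed example; they are not part of the sample project above.

```xml
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <!-- Version assumed for this example -->
    <version>4.11</version>
    <!-- 'jar' is the default type, shown here only for completeness -->
    <type>jar</type>
    <!-- Available on the test classpath only -->
    <scope>test</scope>
  </dependency>
</dependencies>
```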

All these dependencies are handled by Maven for you. It obtains the jars from a remote repository, installs them in your local repository and, depending on the packaging type of your project, includes them in your output artifact.

The 'dependencies' element as a direct child of the 'project' element performs all these actions for you. But we can also find a 'dependencies' element inside a 'dependencyManagement' element. It allows the developer to centralize the configuration of dependencies (defining versions, scopes, ...) and use it later in the project's 'dependencies' element.

The main advantage of this technique is that the information provided in 'dependencyManagement' can be inherited by subprojects, so you don't have to repeat these parameters in each subproject file.
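
A minimal sketch of this technique, shown as two fragments that belong to two different 'pom.xml' files. The JUnit coordinates and version are assumptions used only for illustration.

```xml
<!-- Fragment of the parent POM: version and scope are declared once.
     Nothing is added to the classpath by this section alone. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Fragment of a child POM: only groupId and artifactId are needed;
     version and scope are taken from the parent's dependencyManagement -->
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
  </dependency>
</dependencies>
```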

Inheritance

One of the concepts that reduces waste in build management when you are using Maven is inheritance between projects. In Maven, inheritance is easily defined in the POM. The concept is similar to inheritance in object-oriented languages. A "parent" project can pass its configuration on to its children, if they declare that they inherit from it. The packaging of the parent must be 'pom'. The elements that can be inherited by child projects are:
  • dependencies information
  • developers and contributors
  • reports lists
  • plugin lists
  • plugin executions with matching ids
  • plugin configuration
To make a project inherit from another, you define a 'parent' element with the identification elements of the parent.
In the following sample, you can see how 'mysubprojectapp' inherits from 'myfirstapp'.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <artifactId>mysubprojectapp</artifactId>

  <parent>
    <groupId>com.mycompany.mydepartment</groupId>
    <artifactId>myfirstapp</artifactId>
    <version>1.0</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

</project>

You will notice that the child project does not declare 'groupId' or 'version'. They are also inherited from the parent. If you want to override them, you just have to define them.
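
As a sketch of such an override, the child below keeps the inherited 'groupId' but declares its own 'version'. The value '1.1-SNAPSHOT' is a hypothetical one chosen for the example.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.mycompany.mydepartment</groupId>
    <artifactId>myfirstapp</artifactId>
    <version>1.0</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <!-- groupId is still inherited from the parent -->
  <artifactId>mysubprojectapp</artifactId>
  <!-- Declared explicitly, so it replaces the parent's 1.0 (hypothetical value) -->
  <version>1.1-SNAPSHOT</version>
</project>
```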

In fact, inheritance is implicitly applied to every POM you write. All POMs inherit from a Super POM. All the default values and all the configuration that you don't write are taken from the Super POM. You can take a look at http://maven.apache.org/ref/3.0.4/maven-model-builder/super-pom.html

Aggregation

Maven provides a mechanism to divide the project into modules, which helps us to get smaller units and lower coupling in our project designs.
A project that uses aggregation is also known as a multimodule project. A project with 'pom' packaging can declare a list of 'module' elements inside a 'modules' element. Each 'module' value should be the relative path to the directory that contains the 'pom.xml' file of that module.

<project xmlns="http://maven.apache.org/POM/4.0.0"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                      http://maven.apache.org/xsd/maven-4.0.0.xsd">
  
  <modelVersion>4.0.0</modelVersion>
  <artifactId>myfirstapp</artifactId>
  <groupId>com.mycompany.mydepartment</groupId>
  <packaging>pom</packaging>
  <version>1.0</version>


  <modules>
    <module>api</module>
    <module>impl</module>
    <module>app</module>
  </modules>

</project>

You don't have to worry about the order of the modules, or about dependencies between them. If each POM declares its dependencies, Maven will know which module to build first.
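
To illustrate, the 'pom.xml' of the 'impl' module could look like the sketch below. It assumes that each module's 'artifactId' matches its directory name ('api', 'impl') and that the modules inherit from 'myfirstapp'; neither is stated in the example above.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.mycompany.mydepartment</groupId>
    <artifactId>myfirstapp</artifactId>
    <version>1.0</version>
  </parent>

  <!-- Assumed artifactId, same as the module directory -->
  <artifactId>impl</artifactId>

  <dependencies>
    <!-- Because 'impl' depends on 'api', Maven builds 'api' first,
         whatever the order of the 'module' entries -->
    <dependency>
      <groupId>com.mycompany.mydepartment</groupId>
      <artifactId>api</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
</project>
```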

Usually, projects combine aggregation with inheritance to handle all the project modules. It's easy to describe versions and scopes in the parent and also use it to aggregate all the children. That way, you can run the build process on all your modules at once.

It's also possible to have more than one level of inheritance and aggregation. You will usually find a scenario like this:









my-project-pom (type: pom)
   my-module-1 (type: pom)
      my-module1-api (type: jar)
      my-module1-impl (type: jar)
      my-webapp1 (type: war)
   my-module-2 (type: pom)
      my-module2-api (type: jar)
      my-module2-impl (type: jar)
      my-webapp2 (type: war)




Other information elements


You can use more elements to describe the project and its environment. For instance: Organization, Licenses, Developers, Contributors, Repositories, Plugin Repositories, Distribution Management, SCM, Issue Management, Continuous Integration Management, Mailing Lists and Prerequisites.


We will cover some of these elements in the next posts to automate tasks such as releasing code, publishing artifacts, scheduling automatic builds, etc.

Sunday, December 23, 2012

Apache Maven (I)

Today I'm going to talk about Apache Maven. Essentially, it is a framework to automate the build process of Java-based projects. I chose to start with this tool because, in my opinion, it's the base for healthy project development and code management.

Maven was born as an attempt to simplify the build process in the Jakarta Turbine project. Turbine used Ant in its build process. The project managers realized that they had to deal with different build descriptors that generated a lot of JARs, which ended up checked into CVS. They developed a tool to standardize all these build descriptors using a clear definition of what the project consisted of, and provided a way to publish project information. Lastly, they also supplied a mechanism to distribute and share the output packages (jars, wars, ...) so they could be used in other projects.

That project was successful, and it is now at version 3.0.4. It can be used to build any project written in Java. It's a very mature product with great support and a lot of extensions that make it very easy to manage the whole build process. Developers around the world have adopted this tool in their day-to-day work.

 

Maven philosophy

Before starting to work with Maven, I would like to stop and reflect on its principal goals. We should understand that the main reason for Maven's existence is to help developers reduce waste in the build process. It achieves that goal by making the build process easy, using project descriptors instead of a set of tasks. It employs a uniform build system across the whole source code tree. These descriptors provide high-quality, valuable information about the project's structure, build workflow and dependencies. Using Maven implies following development best practices, such as standardized path names, project structure and information.

 

Start working

My intention in this post is to describe the steps to install Maven and build a minimal project. I admit that this example does not seem to contribute much to reducing waste in the build process, but you will notice that reduction as you go through the following blog posts.

Install Java development kit

These steps install Maven on a Linux system, but they are very similar on other operating systems. First of all, we must install the Java Development Kit. You can download it from the Oracle Java web page.


It's important to know your OS architecture before downloading a specific version. If you don't know it, you can execute the command 'uname -a' in a terminal.

In my case, I'm running Fedora 16 x64, so I chose 'jdk-7u10-linux-x64.tar.gz'.

Once you have it, you can decompress it into a directory.

cd ~/Downloads
mkdir -p ~/javaprojects/bin
tar -C ~/javaprojects/bin -z -xvf jdk-7u10-linux-x64.tar.gz

Install Maven


Choose the latest version in tar.gz format. At the time of writing, it is 3.0.4.
cd ~/Downloads
tar -C ~/javaprojects/bin -z -xvf apache-maven-3.0.4-bin.tar.gz

Add the following lines to your ~/.bashrc file:

export JAVA_HOME=~/javaprojects/bin/jdk1.7.0_10
export MAVEN_HOME=~/javaprojects/bin/apache-maven-3.0.4
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH

Now, open a new terminal and execute the command:

mvn -version

You should see an output similar to this:

Maven home: /home/user/javaprojects/bin/apache-maven-3.0.4
Java version: 1.7.0_10, vendor: Oracle Corporation
Java home: /home/user/javaprojects/bin/jdk1.7.0_10/jre
Default locale: ca_ES, platform encoding: UTF-8
OS name: "linux", version: "2.6.43.8-1.fc15.i686", arch: "i386", family: "unix"

Create the first project

Like many other tutorials, we will start by implementing a simple Java program that prints "Hello World!" to the standard output. Let's start by creating a directory for your new project.

mkdir ~/javaprojects/helloworld
cd ~/javaprojects/helloworld

Now, you can create a directory that will store your first class.

mkdir -p ~/javaprojects/helloworld/src/main/java/com/example

As you can see, I will put the code in 'src/main/java'. It's the path used to store all Java source files. Below it, you use the directory path that corresponds to the package of your class. In my case, the class will be in the package com.example, so I create a new file in:

~/javaprojects/helloworld/src/main/java/com/example/HelloWorld.java

Write the class:

package com.example;

public class HelloWorld {
   // Entry point: prints a greeting to the standard output
   public static void main(String[] args) {
      System.out.println("Hello World!");
   }
}
Now comes the interesting step: writing the project descriptor. Maven uses XML files called 'pom.xml' to describe the project. This name comes from "Project Object Model". Maven reads it to obtain the information it needs to build the project properly.

In this case, we only need a minimal 'pom.xml'. You must create the 'pom.xml' file in '~/javaprojects/helloworld/' and fill it with the following content:

<project>
   <modelVersion>4.0.0</modelVersion>
   <artifactId>my-app</artifactId>
   <groupId>com.mycompany.app</groupId>
   <version>1</version>
</project>
That's all. We have finished setting up our first project. Besides 'modelVersion', which states the version of the POM format, you can observe that there are three obligatory elements:
  • artifactId: Your project identifier.
  • groupId: A group identifier used to group your projects.
  • version: Version number of your project build.

How to build

In the following posts, we will pay more attention to the build phases, but at this point we can already use some of them:
  • mvn clean : Cleans all the data generated by previous builds.
  • mvn compile : Compiles the project.
  • mvn package : Packages the project. The type of output depends on the project type. By default it generates a JAR.
  • mvn install : Installs the output package and the project information in your local repository.
  • mvn deploy : Deploys the output package and the project information to a remote repository.
  • mvn help:describe -Dcmd=install : Shows help about commands. In this case, it shows help about the 'install' command.
If you go back to your project and execute 'mvn compile', you will see that at the end of the build process there is a new directory called 'target'. This is the directory where Maven generates all its outputs. In your case, it contains a subdirectory with all the compiled classes, 'classes/com/example/HelloWorld.class'. You can try your app by going to 'target/classes' and executing the command:

java com.example.HelloWorld
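
As an optional next step, the POM could be extended so that the packaged JAR knows its main class. This is only a sketch: it assumes the maven-jar-plugin with the version shown, and it reuses the coordinates and class name from this post.

```xml
<project>
   <modelVersion>4.0.0</modelVersion>
   <artifactId>my-app</artifactId>
   <groupId>com.mycompany.app</groupId>
   <version>1</version>

   <build>
      <plugins>
         <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <!-- Version assumed for this example -->
            <version>2.4</version>
            <configuration>
               <archive>
                  <manifest>
                     <!-- Written as Main-Class in the JAR manifest -->
                     <mainClass>com.example.HelloWorld</mainClass>
                  </manifest>
               </archive>
            </configuration>
         </plugin>
      </plugins>
   </build>
</project>
```

With this configuration, running 'mvn package' and then 'java -jar target/my-app-1.jar' from the project directory should start the same program without naming the class.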

In the next post I will talk about POMs.

Monday, December 17, 2012

Code management methodologies

I recently finished teaching a course on software deployment methodologies at the company where I work. In this course I tried to introduce techniques, tools and procedures that help to improve quality and reduce time and cost in software development.

Nowadays, companies already use these techniques in a very natural way. The advantages of incorporating them into the development life cycle have been widely proven. Very few still work without a source control management system, development frameworks, or test and deployment plans. In the vast majority of cases these tools are being used more and more and are becoming essential.

If we take a look at companies that develop large software (and at open source software projects), in which many developers are involved, we can see how they apply teamwork techniques. One of the most popular is "continuous integration". It is based on bringing the pieces of developed code into the main software as soon as possible. This technique allows all developers to work with the most up-to-date version of the project, while making the merge of new code with the rest less painful. To apply this technique properly, our development process must have the appropriate tools to ensure code quality (via tests) and to automate the integration tasks. These mechanisms reduce the time we spend joining code, and ensure the correct integration of all the parts of the software.

Other tools that help us succeed in software development are project management and documentation apps. They should allow us to plan, monitor and report whatever happens in the software development process, from its conception to its delivery and throughout its evolution.

In this course, in addition to a brief overview of the UP methodology, I focused on applying the techniques involved in the implementation phases. Since my work is based on Java development, I chose tools that allow us to apply these techniques in a natural way.

We saw:

  • Apache Maven: Framework to automate building, packaging and reporting for apps written in Java. (http://maven.apache.org/)
  • Sonatype Nexus: Repository management tool for Maven libraries. (http://www.sonatype.org/nexus/)
  • JUnit: Library for unit testing applications written in Java. (https://github.com/kentbeck/junit/wiki)
  • Selenium: A tool that automates web applications for testing purposes. It's based on automatic execution through web browsers like Firefox, Chrome, Safari, etc.
  • Subversion: A source code management system. (http://subversion.tigris.org/)
  • Git: Another SCM. The main difference between Subversion and Git is the distributed model that Git uses. (http://git-scm.com/)
  • Jenkins: Web framework that automates deployment tasks using all the previous technologies. (http://jenkins-ci.org/)
  • Trac: Project management and bug/issue tracking system. It also helps us with documentation tasks. (http://trac.edgewall.org/)
  • Jira / Confluence: Two web applications that help with planning and managing issues, like Trac.


Shortly, I will publish some blog posts about the different topics mentioned above.