Jen-Ming Chung

Map-Reduce With MongoDB and Morphia

This is a short note on using Map-Reduce with MongoDB and Morphia. First, add the MongoDB Java driver and Morphia dependencies.

<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongo-java-driver</artifactId>
    <version>2.11.2</version>
</dependency>
<dependency>
    <groupId>org.mongodb.morphia</groupId>
    <artifactId>morphia</artifactId>
    <version>0.105</version>
</dependency>
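
For orientation, here is a minimal, library-free sketch of what the two Map-Reduce phases compute. It does not use Morphia or the MongoDB driver, and it is not how MongoDB runs the job (the server executes JavaScript map and reduce functions). The class name, the helper `doc`, and the field names `category` and `count` are all hypothetical and exist only for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapReduceSketch {

    // Map phase: emit one (key, value) pair per document.
    // "category" and "count" are hypothetical field names.
    static Map.Entry<String, Integer> map(Map<String, Object> doc) {
        return new AbstractMap.SimpleEntry<>(
                (String) doc.get("category"), (Integer) doc.get("count"));
    }

    // Reduce phase: fold all values emitted for one key into a single value.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Group the emitted pairs by key, then reduce each group.
    static Map<String, Integer> mapReduce(List<Map<String, Object>> docs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map<String, Object> doc : docs) {
            Map.Entry<String, Integer> emitted = map(doc);
            grouped.computeIfAbsent(emitted.getKey(), k -> new ArrayList<>())
                   .add(emitted.getValue());
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    // Hypothetical helper that builds an in-memory stand-in for a document.
    static Map<String, Object> doc(String category, int count) {
        Map<String, Object> d = new HashMap<>();
        d.put("category", category);
        d.put("count", count);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                doc("java", 1), doc("mongodb", 5), doc("java", 2));
        System.out.println(mapReduce(docs));
    }
}
```

With a real database, the grouping step happens on the server, and the result is written to an output collection or returned inline.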

Using the MongoDB Aggregation Framework via MongoDB Asynchronous Java Driver

I often use Morphia in projects to map Java objects to and from MongoDB, and I use its Query API instead of building complex DBObject queries by hand. However, the current version (v. 0.105) lacks support for the aggregate command. Fortunately, the MongoDB Asynchronous Java Driver provides an aggregate builder for constructing complex pipelines of operators in a fluent way. The following paragraphs walk through an example that shows some basic uses of mongodb-async-driver with the aggregation pipeline framework.

Integrating Swagger Into JAX-RS With Java EE 6 Specification

Swagger is an awesome framework that we often use to describe, consume, and visualize our RESTful web services. Typically, we run Jersey as a servlet in Tomcat, specify the Swagger package and the Swagger configuration class in web.xml, and finally annotate the resources, methods, and models to complete the configuration. Our team recently built a Java EE 7 application for a RESTful web service. The goal of this article is to share our experience of configuring Swagger in Glassfish 4 without a web.xml.

How to Solve Jsoup Does Not Get Complete HTML Document

[Image: massive-bytes-content]

When crawling web pages with jsoup, only part of the HTML content is returned if the document is too large; the example above, for instance, transferred over 6MB of content. According to jsoup's API Reference, the default maximum body size is 1MB. We can therefore set maxBodySize to zero on the jsoup connection to remove this limit, and pair it with a sufficiently long timeout.

Set the maximum bytes to read from the (uncompressed) connection into the body, before the connection is closed, and the input truncated. The default maximum is 1MB. A max size of zero is treated as an infinite amount (bounded only by your patience and the memory available on your machine).

Moreover, if the server supports one or more compression schemes, the response data may be compressed with one or more of them. We can set the Accept-Encoding header on our jsoup connection to a comma-separated list of the compression schemes we accept (e.g., gzip).

// maxBodySize(0) removes the 1MB cap; timeout is in milliseconds (600000 = 10 minutes)
Document doc = Jsoup.connect(url)
    .header("Accept-Encoding", "gzip, deflate")
    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0")
    .maxBodySize(0)
    .timeout(600000)
    .get();
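
To make the maxBodySize semantics quoted above concrete, the following stdlib-only sketch mimics the described behaviour: read at most a given number of bytes and drop the rest, with zero meaning "no limit". It is not jsoup's own implementation; the class `CappedRead` and the helper `readCapped` are hypothetical names, and the 3MB byte array merely stands in for a large response body.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class CappedRead {

    // Reads at most maxBytes from the stream; maxBytes == 0 means "no limit".
    // Hypothetical helper for illustration, not part of jsoup.
    static byte[] readCapped(InputStream in, int maxBytes) throws IOException {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (true) {
            int want = (maxBytes == 0) ? buffer.length : Math.min(buffer.length, remaining);
            if (want == 0) {
                break; // cap reached: the rest of the input is silently dropped
            }
            int n = in.read(buffer, 0, want);
            if (n == -1) {
                break; // end of stream
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] page = new byte[3 * 1024 * 1024]; // stand-in for a 3MB body
        int oneMb = 1024 * 1024;
        // With a 1MB cap the body is truncated; with 0 it is read in full.
        System.out.println(readCapped(new ByteArrayInputStream(page), oneMb).length);
        System.out.println(readCapped(new ByteArrayInputStream(page), 0).length);
    }
}
```

The truncation is silent in this sketch, which matches the symptom described in the post: the parse succeeds, but the document is incomplete.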

CentOS: Installing Apache Portable Runtime (APR) for Tomcat

In Tomcat, the default HTTP connector is the BIO (Blocking I/O) connector, which is stable but handles only low concurrency. To boost Tomcat's performance, the alternatives are to adopt either the NIO (Non-Blocking I/O) or the APR (Apache Portable Runtime) connector. In particular, APR generally performs better than the others when SSL is used. For more details on how these connectors compare, see Mike Noordermeer's comparison.