Archive for category Howtos

HTTP Basic Authentication for Sails.js 0.9.x using Passport

One of my initial excursions into Sails.js territory:

https://gist.github.com/adityamukho/6260759


Minifying + Compressing an AngularJS App

This short tutorial demonstrates how to prepare an AngularJS app for deployment to a static web server, with all the bells and whistles needed to score an A on YSlow.

Key Assumptions

  1. Your dev setup is on Linux or another Bash-compatible environment. (The build script is written for Bash.)
  2. You use Git for SCM. (The build script will use git for some operations. Feel free to alter the script, and get rid of this dependency.)
  3. The app is structured as recommended by the angular-seed project. (Build script expects certain folders to be present at specific locations. You can adapt it to your project structure.)

Dependencies

Go grab the following:

  1. YUI Compressor
  2. Git
  3. Gzip (Installed by default on most *NIX systems. Pull from distro repos otherwise.)
  4. Node.js (For testing stuff on your localhost. Not required if you have/prefer some other server for delivering static content.)
  5. Stomach for shell scripts

Prologue

I wanted to split my web application into two distinct components:

  1. A client-side, JS-driven presentation layer.
  2. A lightweight, REST-based backend.

I’ve had to sort out a lot of issues to get the two to cooperate while running on different servers under different domains, to use digest-based authentication instead of cookies (REST is stateless), and so on – but that’s another post. This one focuses on efficiently delivering the UI portion – HTML + CSS + JS + media – which, from the server’s point of view, is static content.

Preparing AngularJS Scripts for Minification

The AngularJS docs provide some information on how to prepare controllers for minification here. Quoting from the page:

Since angular infers the controller’s dependencies from the names of arguments to the controller’s constructor function, if you were to minify the JavaScript code for PhoneListCtrl controller, all of its function arguments would be minified as well, and the dependency injector would not be able to identify services correctly.

PhoneListCtrl is part of the angular-phonecat application, used for driving the on-site tutorial.

Basically, every controller defined by your application needs to have its dependencies explicitly declared. For the example above, it looks something like:

PhoneListCtrl.$inject = ['$scope', '$http'];

There is one more way defined on the site, but I prefer the method above.

However, this is not enough to get minified scripts working right. YUI Compressor changes closure parameter names too, and this doesn’t go down well with Angular’s dependency injector. You also need to use inline annotations when defining custom services. You can find a usage example here; a sketch follows below.
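
For instance, here’s a minimal sketch of the inline-annotation style (the module and service names below are hypothetical):

// Minification-safe service: the dependency names are string literals,
// which YUI Compressor leaves alone, while the closure parameters are
// free to be renamed.
angular.module('myApp.services', []).
  factory('phoneData', ['$http', function ($http) {
    return {
      list: function () { return $http.get('phones/phones.json'); }
    };
  }]);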

Additionally, you can collate all content from controllers.js, directives.js, services.js and filters.js into app.js to reduce the number of calls made to the server.
Don’t forget to modify your index.html / index-async.html to reflect this change.
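
The collation itself can be a one-liner – a sketch assuming the angular-seed layout, where all of these files live under app/js:

$ cd app/js
$ cat controllers.js directives.js services.js filters.js >> app.js

Each file registers against its own Angular module, so concatenation order rarely matters, but do sanity-check the combined app.js in a browser before minifying it.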

The Build Script

If you’re sticking to the folder structure provided by angular-seed, you’ll have an app folder in your project root. Adjacent to this, create a build folder to contain the minified and compressed output files generated by the build script. You can tell git to ignore this folder by adding the following line to .gitignore:

/build/*

You can put your build script anywhere you like, and run it from anywhere in the project folder. I have put it inside the conveniently provided scripts folder.

#!/bin/bash

ccred=$(echo -e "\033[0;31m")
ccyellow=$(echo -e "\033[0;33m")
ccgreen=$(echo -e "\033[0;32m")
ccend=$(echo -e "\033[0m")

exit_code=0

cd "$(git rev-parse --show-toplevel)"

#Minify
echo -e "$ccyellow========Minify========$ccend"
for ext in 'css' 'js'
do
    for infile in $(find ./app -name "*.$ext" | grep -v min)
    do
        outfile="$(echo $infile |sed 's/\(.*\)\..*/\1/').min.$ext"
        echo -n -e "\nMinifying $infile to $outfile: "
        if [ ! -f "$outfile" ] || [ "$infile" -nt "$outfile" ]
        then
            yuicompressor "$infile" > "$outfile"
            if [ $? -ne 0 ]
            then
                exit_code=1
                echo -e "\n$ccred========Failed minification of $infile to $outfile . Reverting========$ccend\n" >&2
                git checkout -- "$outfile" || rm -f "$outfile"
            else
                echo $ccgreen Success.$ccend
            fi
        else
            echo $ccgreen Not modified.$ccend
        fi
    done
done

#Compress / Copy
echo -e "\n\n$ccyellow========Compress / Copy========$ccend\n"
for infile in `find ./app -type f -not -empty`
do
    filetype="$(grep -r -m 1 "^" "$infile" |grep '^Binary file')"
    outfile="./build/$(echo $infile |cut -c7-)"

    mkdir -p $(dirname "$outfile")
    if [ ! -f "$outfile" ] || [ "$infile" -nt "$outfile" ]
    then
        if [ "$filetype" = "" ] #Compress text files
        then
            echo -n -e "\nCompressing $infile to $outfile: "
            gzip -c "$infile" > "$outfile"
        else #Copy binary files as is
            echo -e -n "\nCopying $infile to $outfile: "
            cp "$infile" "$outfile"
        fi
        if [ $? -ne 0 ]
        then
            exit_code=2
            echo -e "\n$ccred========Failed compress / copy of $infile to $outfile . Reverting========$ccend\n" >&2
        else
            echo $ccgreen Success.$ccend
        fi
    else
        echo -e "$infile -> $outfile: $ccgreen Not modified.$ccend\n"
    fi
done

echo -e "\n$ccyellow========Finished========$ccend"
exit $exit_code

Once you run this script, every app/file.[css | js] will have a minified, gzipped copy at build/file.min.[css | js]. Every other file in the app folder will be either:

  1. compressed and copied (name unchanged) into the build folder if it is a text file, or
  2. simply copied into the build folder if it is a binary file (like an image).

Your CSS and JS references need to be updated to their corresponding min versions in index.html / index-async.html.

Now that you’ve got a compressed, minified version of your app in the build folder, you can deploy it to any static server. But you do need to set your HTTP response headers properly, or the browser WILL show garbage. Most importantly, any compressed content must be served with the HTTP response header:

Content-Encoding: gzip

Additionally, for every file that is static content, it makes sense to set a far future date using an Expires header similar to the following:

Expires: Thu, 31 Dec 2037 20:00:00 GMT
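
A quick way to check both headers once a server is up – the URL below is an assumption based on my local angular-seed setup:

$ curl -sI http://localhost:8000/build/index.html | grep -iE 'content-encoding|expires'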

The NodeJS web-server.js Script

The contents of the build folder are technically ready to be uploaded to any web server, but you will probably want to run the app from your localhost to first check if everything works fine. The built-in web-server.js is very useful to quickly launch and test your app, but it needs a few mods in order to serve the compressed content from the build folder correctly. The Content-Encoding header is sufficient to render the page correctly, but if you’re a stickler for good YSlow grades even on your localhost, you will want to add the Expires headers as well. Search for the following response codes in your web-server.js and add the lines listed below:

  1. 200 (writeDirectoryIndex), 500, 404, 403, 301:
    'Expires': 'Thu, 31 Dec 2037 20:00:00 GMT',
    
  2. 200 (sendFile) (After var file = fs.createReadStream(path);):
    var fileType = StaticServlet.MimeMap[path.split('.').pop()];
    var contentType = fileType || 'text/plain';
    res.writeHead(200, {
        'Content-Type': contentType,
        'Expires': 'Thu, 31 Dec 2037 20:00:00 GMT',
        'Content-Encoding': ((path.indexOf('./build') === 0) && ((contentType.indexOf('text') === 0) || (contentType.indexOf('application/javascript') === 0))) ? 'gzip' : ''
      });
    

That’s it! Now when you run the web-server.js, all content from the build folder will be correctly served with the ‘gzip’ header (unless it is a binary).
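
For reference, here’s how I launch and eyeball it locally – the port and paths assume a stock angular-seed checkout served from the project root, in line with the ./build path check added above:

$ node scripts/web-server.js
$ xdg-open http://localhost:8000/build/index.html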


Setting up custom network routes

On some networks, I need to connect to a (firewalled) intranet over wired ethernet, while general unrestricted network access is available over WiFi. Typically I need to stay connected to both networks so as to access machines on the LAN as well as the WWW. Trouble is (at least on my F17 machines) the system is configured to use the ethernet interface (if live) by default for all outbound requests, regardless of whether the WiFi is enabled or not.

This is not a convenient situation, as the LAN is often configured to block requests going outside the local subnet. This means every time I have to go online, I need to disable my ethernet Iface first! The source of this endless bother can be traced to the way the system has set up its routing. Just fire up a terminal and issue the following command to get your current routes. In one such run I get the following output:

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.11.2    0.0.0.0         UG    0      0        0 p1p1
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 wlan0
192.168.8.0     0.0.0.0         255.255.252.0   U     0      0        0 p1p1

This tells me that the default route for all outbound requests (those that do not specifically match any other rule) is through Iface p1p1 (ethernet or wired LAN). I need this to be set to wlan0 (WiFi) instead.

That is done (as root) by first deleting the existing default route, followed by adding a new rule to route default requests through WiFi:

# route del default
# route add -net 0.0.0.0 dev wlan0 gw 192.168.0.1
# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.0.1     0.0.0.0         UG    0      0        0 wlan0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 wlan0
192.168.8.0     0.0.0.0         255.255.252.0   U     0      0        0 p1p1

The gateway IP for the default route should be the default gateway for your WiFi.

Post these steps, the system will route requests within the LAN through p1p1 (note that this route was already configured for p1p1 in my case and is a stricter rule than all the others, hence is the first to match) and outbound traffic to non-local addresses through wlan0.
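
These route changes don’t survive a reboot or a network reconnect, so I keep the commands in a tiny script (values taken from the session above – substitute your own interface and gateway) and re-run it as root whenever needed:

#!/bin/bash
# Send default (non-LAN) traffic out over WiFi instead of wired ethernet.
route del default
route add -net 0.0.0.0 dev wlan0 gw 192.168.0.1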


Setup a local DNS server to support wildcard sub-domains on localhost

I often develop sites/web applications that provide some common or core functionality at a top level domain and use sub-domains for hosting portals, micro-sites or related apps. While developing these apps on my local machine I might have a dozen or so portals running under sub-domains of a top level local domain. It’s possible to add an entry to the hosts file, one for the top domain and one for each sub-domain, but the number of sub-domains may quickly grow big enough to render this method way too cumbersome.

To get around this hurdle, I run a lightweight DNS server locally, that has support for wildcard sub-domains. It’s called dnsmasq and is available as a standard package on most Linux systems. It installs as a system service, and is configured through the file /etc/dnsmasq.conf (on most rpm-based systems).
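
On Fedora-style systems, installing and enabling it looks something like this (package and service names may vary by distro):

$ sudo yum install dnsmasq
$ sudo systemctl enable dnsmasq.service
$ sudo systemctl start dnsmasq.service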

Below is a quick round-up of the bare minimum settings you need to enable and configure in order to get up and running:

# Never forward plain names (without a dot or domain part)
domain-needed

# Never forward addresses in the non-routed address spaces.
bogus-priv

# This option only affects forwarding, SRV records originating for
# dnsmasq (via srv-host= lines) are not suppressed by it.
filterwin2k

# Add domains which you want to force to an IP address here.
# The example below sends any host in double-click.net to a local
# web-server.
address=/double-click.net/127.0.0.1
address=/localhost/127.0.0.1

# The IP address to listen on
listen-address=127.0.0.1

Restart dnsmasq after these changes and run something like:

$ ping xyz.localhost

to ensure your settings are correct.
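
Note that for the lookup to actually go through dnsmasq, your system resolver must point at 127.0.0.1. A blunt way to check and, if needed, set this (assuming NetworkManager isn’t overwriting resolv.conf on reconnect):

$ cat /etc/resolv.conf                  # 'nameserver 127.0.0.1' should come first
$ echo 'nameserver 127.0.0.1' | sudo tee /etc/resolv.conf
$ dig +short abc.localhost @127.0.0.1   # should print 127.0.0.1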


Building a Single Sign-On Module for the BIRT Report Viewer – Part 3-2

This is the fourth (actually the second part of the third) post of the BIRT SSO series, wherein I describe the implementation of a single sign-on module for the Eclipse BIRT Report Viewer. The introduction and server configuration are covered in Part 1 and Part 2 respectively, and the Drupal Module component in Part 3-1. It is recommended that you read them first in order to get acquainted with the background and the premises on which this solution is built. In this post, I describe the BIRT Module.

3.2: The BIRT Module

In Part 3-1, I described how the Drupal Module encrypts a string of information and sends it over to the BIRT component. Once the BIRT Module has received this encrypted data, it needs to decrypt and process the string to provide or revoke authentication for a particular user. There are 4 classes that are involved in orchestrating this process:

  1. AuthFilter: An instance of javax.servlet.Filter that intercepts all incoming requests to BIRT’s servlets and allows or rejects them based on session authentication.
  2. AuthManagerServlet: A subclass of javax.servlet.http.HttpServlet that receives and processes incoming authentication requests from the Drupal Module.
  3. SessionLifecycleListener: An instance of javax.servlet.http.HttpSessionListener that listens for session events generated by client-connects and session timeouts, and does some voodoo.
  4. Transcoder: A simple utility class that provides methods for decryption and checksums.

Before I get into the nitty-gritties of the individual classes, a little explanation of the overall flow is needed. Briefly, here’s what transpires:

  1. AuthManagerServlet receives an encrypted request. It attempts to decrypt it using Transcoder‘s utility methods. If successful, the following parameters now become available to it:
    1. A universally unique session id (session created by Drupal).
    2. A flag denoting whether this is a login or a logout operation.
    3. A timeout value which can be set for a newly created session (in case of a login).
  2. In case this is a login operation, this session id is stored in a map (as the key, with NULL for the value).
    1. Every request coming in from the client’s browser will contain a cookie with this session id.
    2. This request has come in directly from the Drupal server, so the session created on the Java side for this request does not map to the actual client browser.
  3. When the client sends its first request to the reporting component, a new Java session is created for it, and the session map is updated with the session id of this session. So now we have a Drupal session mapped to a Java session, cookies for both being stored in the client browser. The timeout value is also set, at this point, for the newly created Java session.
  4. All further requests coming in from the client are validated for the correct Drupal and Java session ids.
  5. If the original encrypted request was a logout operation, then the appropriate entry is removed from the session map.
  6. A session map entry can also be removed if triggered by a session timeout.

Now that we have the flow in mind, let’s dive right into the code. With the background knowledge given above, the logic should be fairly easy to follow.

public class AuthFilter implements Filter
{

  private static final boolean debug = false;
  // The filter configuration object we are associated with.  If
  // this value is null, this filter instance is not currently
  // configured.
  private FilterConfig filterConfig = null;

  public AuthFilter ()
  {
  }

  private boolean validateSession (HttpServletRequest request, Map<String, String> authorizedSessions)
  {
    Cookie[] cookies = request.getCookies ();
    if (cookies != null)
    {
      for (int i = 0; i < cookies.length; ++i)
      {
        String remoteSession = cookies[i].getValue ();
        if (authorizedSessions.containsKey (remoteSession))
        {
          String localSession = authorizedSessions.get (remoteSession);
          String jSessionId = request.getSession ().getId ();
          if (localSession == null)
          {
            authorizedSessions.put (remoteSession, jSessionId);
            return true;
          }
          else if (localSession.equals (jSessionId) )
          {
            return true;
          }
          break;
        }
      }
    }
    return false;
  }

  /**
   *
   * @param request The servlet request we are processing
   * @param response The servlet response we are creating
   * @param chain The filter chain we are processing
   *
   * @exception IOException if an input/output error occurs
   * @exception ServletException if a servlet error occurs
   */
  @Override
  public void doFilter (ServletRequest request, ServletResponse response,
                        FilterChain chain)
      throws IOException, ServletException
  {

    if (debug)
    {
      log ("AuthFilter:doFilter()");
    }

    // Create wrappers for the request and response objects.
    // Using these, you can extend the capabilities of the
    // request and response, for example, allow setting parameters
    // on the request before sending the request to the rest of the filter chain,
    // or keep track of the cookies that are set on the response.
    //
    // Caveat: some servers do not handle wrappers very well for forward or
    // include requests.
    RequestWrapper wrappedRequest = new RequestWrapper ((HttpServletRequest) request);
    ResponseWrapper wrappedResponse = new ResponseWrapper ((HttpServletResponse) response);

    Map<String, String> authorizedSessions = (Map<String, String>) wrappedRequest.getServletContext ().getAttribute (AuthManagerServlet.class.getPackage ().getName () + "." + AuthManagerServlet.class.getName () + ".authorizedSessions");
    if (authorizedSessions == null)
    {
      wrappedResponse.sendError (HttpServletResponse.SC_UNAUTHORIZED, "Unauthorized");
      return;
    }

    if (!validateSession (wrappedRequest, authorizedSessions))
    {
      wrappedResponse.sendError (HttpServletResponse.SC_UNAUTHORIZED, "Unauthorized");
      return;
    }

    Throwable problem = null;

    try
    {
      chain.doFilter (wrappedRequest, wrappedResponse);
    }
    catch (IOException | ServletException t)
    {
      // If an exception is thrown somewhere down the filter chain,
      // we still want to execute our after processing, and then
      // rethrow the problem after that.
      problem = t;
    }

    // If there was a problem, we want to rethrow it if it is
    // a known type, otherwise log it.
    if (problem != null)
    {
      if (problem instanceof ServletException)
      {
        throw (ServletException) problem;
      }
      if (problem instanceof IOException)
      {
        throw (IOException) problem;
      }
      sendProcessingError (problem, response);
    }
  }

  /**
   * Return the filter configuration object for this filter.
   */
  public FilterConfig getFilterConfig ()
  {
    return (this.filterConfig);
  }

  /**
   * Set the filter configuration object for this filter.
   *
   * @param filterConfig The filter configuration object
   */
  public void setFilterConfig (FilterConfig filterConfig)
  {
    this.filterConfig = filterConfig;
  }

  /**
   * Destroy method for this filter
   */
  @Override
  public void destroy ()
  {
  }

  /**
   * Init method for this filter
   */
  @Override
  public void init (FilterConfig filterConfig)
  {
    this.filterConfig = filterConfig;
    if (filterConfig != null)
    {
      if (debug)
      {
        log ("AuthFilter: Initializing filter");
      }
    }
  }

  /**
   * Return a String representation of this object.
   */
  @Override
  public String toString ()
  {
    if (filterConfig == null)
    {
      return ("AuthFilter()");
    }
    StringBuilder sb = new StringBuilder ("AuthFilter(");
    sb.append (filterConfig);
    sb.append (")");
    return (sb.toString ());

  }

  private void sendProcessingError (Throwable t, ServletResponse response)
  {
    String stackTrace = getStackTrace (t);

    if (stackTrace != null && !stackTrace.equals (""))
    {
      try
      {
        response.setContentType ("text/html");
        try (PrintStream ps = new PrintStream (response.getOutputStream ()); PrintWriter pw = new PrintWriter (ps))
        {
pw.print ("\n\nError\n\n\n"); //NOI18N

          // PENDING! Localize this for next official release
          pw.print ("</pre>
<h1>The resource did not process correctly</h1>
<pre>

\n
\n");
          pw.print (stackTrace);
          pw.print ("

\n"); //NOI18N
 }
 response.getOutputStream ().close ();
 }
 catch (Exception ex)
 {
 }
 }
 else
 {
 try
 {
 try (PrintStream ps = new PrintStream (response.getOutputStream ()))
 {
 t.printStackTrace (ps);
 }
 response.getOutputStream ().close ();
 }
 catch (Exception ex)
 {
 }
 }
 }

 public static String getStackTrace (Throwable t)
 {
 String stackTrace = null;
 try
 {
 StringWriter sw = new StringWriter ();
 PrintWriter pw = new PrintWriter (sw);
 t.printStackTrace (pw);
 pw.close ();
 sw.close ();
 stackTrace = sw.getBuffer ().toString ();
 }
 catch (Exception ex)
 {
 }
 return stackTrace;
 }

 public void log (String msg)
 {
 filterConfig.getServletContext ().log (msg);
 }

 /**
 * This request wrapper class extends the support class HttpServletRequestWrapper, which implements all the methods in the
 * HttpServletRequest interface, as delegations to the wrapped request. You only need to override the methods that you need to change. You
 * can get access to the wrapped request using the method getRequest()
 */
 class RequestWrapper extends HttpServletRequestWrapper
 {

 public RequestWrapper (HttpServletRequest request)
 {
 super (request);
 }
 // You might, for example, wish to add a setParameter() method. To do this
 // you must also override the getParameter, getParameterValues, getParameterMap,
 // and getParameterNames methods.
 protected HashMap localParams = null;

 public void setParameter (String name, String[] values)
 {
 if (debug)
 {
 System.out.println ("AuthFilter::setParameter(" + name + "=" + values + ")" + " localParams = " + localParams);
 }

 if (localParams == null)
 {
 localParams = new HashMap ();
 // Copy the parameters from the underlying request.
 Map wrappedParams = getRequest ().getParameterMap ();
 Set keySet = wrappedParams.keySet ();
 for (Iterator it = keySet.iterator (); it.hasNext ();)
 {
 Object key = it.next ();
 Object value = wrappedParams.get (key);
 localParams.put (key, value);
 }
 }
 localParams.put (name, values);
 }

 @Override
 public String getParameter (String name)
 {
 if (debug)
 {
 System.out.println ("AuthFilter::getParameter(" + name + ") localParams = " + localParams);
 }
 if (localParams == null)
 {
 return getRequest ().getParameter (name);
 }
 Object val = localParams.get (name);
 if (val instanceof String)
 {
 return (String) val;
 }
 if (val instanceof String[])
 {
 String[] values = (String[]) val;
 return values[0];
 }
 return (val == null ? null : val.toString ());
 }

 @Override
 public String[] getParameterValues (String name)
 {
 if (debug)
 {
 System.out.println ("AuthFilter::getParameterValues(" + name + ") localParams = " + localParams);
 }
 if (localParams == null)
 {
 return getRequest ().getParameterValues (name);
 }
 return (String[]) localParams.get (name);
 }

 @Override
 public Enumeration getParameterNames ()
 {
 if (debug)
 {
 System.out.println ("AuthFilter::getParameterNames() localParams = " + localParams);
 }
 if (localParams == null)
 {
 return getRequest ().getParameterNames ();
 }
 return Collections.enumeration (localParams.keySet ());
 }

 @Override
 public Map getParameterMap ()
 {
 if (debug)
 {
 System.out.println ("AuthFilter::getParameterMap() localParams = " + localParams);
 }
 if (localParams == null)
 {
 return getRequest ().getParameterMap ();
 }
 return localParams;
 }
 }

 /**
 * This response wrapper class extends the support class HttpServletResponseWrapper, which implements all the methods in the
 * HttpServletResponse interface, as delegations to the wrapped response. You only need to override the methods that you need to change.
 * You can get access to the wrapped response using the method getResponse()
 */
 class ResponseWrapper extends HttpServletResponseWrapper
 {

 public ResponseWrapper (HttpServletResponse response)
 {
 super (response);
 }
 // You might, for example, wish to know what cookies were set on the response
 // as it went throught the filter chain. Since HttpServletRequest doesn't
 // have a get cookies method, we will need to store them locally as they
 // are being set.
 /*
 * protected Vector cookies = null;
 *
 * // Create a new method that doesn't exist in HttpServletResponse public Enumeration getCookies() { if (cookies == null) cookies =
 * new Vector(); return cookies.elements(); }
 *
 * // Override this method from HttpServletResponse to keep track // of cookies locally as well as in the wrapped response. public void
 * addCookie (Cookie cookie) { if (cookies == null) cookies = new Vector(); cookies.add(cookie);
 * ((HttpServletResponse)getResponse()).addCookie(cookie); }
 */
 }
}
public class AuthManagerServlet extends HttpServlet
{

  public static final String ENCRYPTION_KEY = "ENCRYPTION_KEY";
  private Map<String, String> authorizedSessions = new HashMap<> ();
  private String encryptionKey, initialVector = null;
  private Timer timer = new Timer (true);

  private enum Operations
  {

    LOGIN, LOGOUT
  }

  /**
   * Processes requests for both HTTP
   * <code>GET</code> and
   * <code>POST</code> methods.
   *
   * @param request servlet request
   * @param response servlet response
   * @throws ServletException if a servlet-specific error occurs
   * @throws IOException if an I/O error occurs
   */
  protected void processRequest (HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException
  {
    response.setContentType ("text/html;charset=UTF-8");
    PrintWriter out = response.getWriter ();
    String message = null;
    try
    {
      String encData = request.getParameter ("data");
      if (encData == null)
      {
        message = "Operation failed. Data cannot be null.";
        response.sendError (HttpServletResponse.SC_UNAUTHORIZED, message);
      }
      else
      {
        String params = Transcoder.decrypt (encData, initialVector, encryptionKey);
        StringTokenizer st = new StringTokenizer (params, "&=");
        final Map<String, String> paramMap = new HashMap<> ();
        while (st.hasMoreTokens ())
        {
          String name = st.nextToken ();
          String value = st.nextToken ();
          paramMap.put (name, value);
        }

        Operations ops;
        try
        {
          ops = Operations.valueOf (paramMap.get ("op").toUpperCase ());
          switch (ops)
          {
            case LOGIN:
              authorizedSessions.put (paramMap.get ("session_id"), null);
              int timeout = Integer.parseInt (paramMap.get ("timeout"));

              //Timeout is received with each auth request, but is set globally for all future sessions.
              request.getServletContext ().setAttribute (getClass ().getPackage ().getName () + "." + getClass ().getName () + ".sessionTimeout", timeout);
              message = "Login successful.";
              break;
            case LOGOUT:
              authorizedSessions.remove (paramMap.get ("session_id"));
              message = "Logout successful.";
              break;
          }
        }
        catch (IllegalArgumentException e)
        {
          message = "Operation failed. Invalid op.";
          response.sendError (HttpServletResponse.SC_UNAUTHORIZED, message);
        }
      }
    }
    finally
    {
      out.println (message);
      out.close ();
    }
  }

  //
  /**
   * Handles the HTTP
   * <code>GET</code> method.
   *
   * @param request servlet request
   * @param response servlet response
   * @throws ServletException if a servlet-specific error occurs
   * @throws IOException if an I/O error occurs
   */
  @Override
  protected void doGet (HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException
  {
    processRequest (request, response);
  }

  /**
   * Handles the HTTP
   * <code>POST</code> method.
   *
   * @param request servlet request
   * @param response servlet response
   * @throws ServletException if a servlet-specific error occurs
   * @throws IOException if an I/O error occurs
   */
  @Override
  protected void doPost (HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException
  {
    processRequest (request, response);
  }

  /**
   * Returns a short description of the servlet.
   *
   * @return a String containing servlet description
   */
  @Override
  public String getServletInfo ()
  {
    return "Simple Authentication Manager Servlet, based on background info recieved from authorized servers.";
  }//

  @Override
  public void init (ServletConfig config) throws ServletException
  {
    super.init (config);

    ServletContext context = config.getServletContext ();
    context.setAttribute (getClass ().getPackage ().getName () + "." + getClass ().getName () + ".authorizedSessions", authorizedSessions);

    encryptionKey = context.getInitParameter (ENCRYPTION_KEY);
    if (encryptionKey == null)
    {
      Logger.getLogger (getClass ().getName ()).log (Level.SEVERE, "Error!! Encryption key not found.");
    }
    try
    {
      initialVector = Transcoder.md5 (Transcoder.md5 (encryptionKey)).substring (0, 16);
    }
    catch (NoSuchAlgorithmException ex)
    {
      Logger.getLogger (getClass ().getName ()).log (Level.SEVERE, null, ex);
    }
  }
}
@WebListener ()
public class SessionLifecycleListener implements HttpSessionListener
{

  @Override
  public void sessionCreated (HttpSessionEvent hse)
  {
    // The attribute is null until the first login request has been processed,
    // so use the wrapper type and fall back to a default.
    Integer sessionTimeout = (Integer) hse.getSession ().getServletContext ().getAttribute (AuthManagerServlet.class.getPackage ().getName () + "." + AuthManagerServlet.class.getName () + ".sessionTimeout");
    if (sessionTimeout == null || sessionTimeout <= 0)
    {
      sessionTimeout = 600;
    }
    hse.getSession ().setMaxInactiveInterval (sessionTimeout);
  }

  @Override
  public void sessionDestroyed (HttpSessionEvent hse)
  {
    Map<String, String> authorizedSessions = (Map<String, String>) hse.getSession ().getServletContext ().getAttribute (AuthManagerServlet.class.getPackage ().getName () + "." + AuthManagerServlet.class.getName () + ".authorizedSessions");
    String jSessionId = hse.getSession ().getId ();
    for (Iterator<String> i = authorizedSessions.values ().iterator (); i.hasNext ();)
    {
      String value = i.next ();
      if (jSessionId.equals (value))
      {
        // Remove via the iterator: the values view is backed by the map, so this
        // drops the whole entry. (Map.remove(value) would wrongly treat it as a key.)
        i.remove ();
        break;
      }
    }
  }
}
public class Transcoder
{

  private Transcoder ()
  {
  }

  public static String md5 (String input) throws NoSuchAlgorithmException
  {
    MessageDigest md = MessageDigest.getInstance ("MD5");
    byte[] messageDigest = md.digest (input.getBytes ());
    BigInteger number = new BigInteger (1, messageDigest);
    // Zero-pad to 32 hex chars so the digest always matches PHP's md5(),
    // which the Drupal side uses to derive the key and initial vector.
    return String.format ("%032x", number);
  }

  public static String decrypt (String encryptedData, String initialVectorString, String secretKey)
  {
    String decryptedData = null;
    try
    {
      SecretKeySpec skeySpec = new SecretKeySpec (md5 (secretKey).getBytes (), "AES");
      IvParameterSpec initialVector = new IvParameterSpec (initialVectorString.getBytes ());
      Cipher cipher = Cipher.getInstance ("AES/CFB8/NoPadding");
      cipher.init (Cipher.DECRYPT_MODE, skeySpec, initialVector);
      byte[] encryptedByteArray = (new org.apache.commons.codec.binary.Base64 ()).decode (encryptedData.getBytes ());
      byte[] decryptedByteArray = cipher.doFinal (encryptedByteArray);
      decryptedData = new String (decryptedByteArray, "UTF-8");
    }
    catch (NoSuchAlgorithmException | NoSuchPaddingException | InvalidKeyException | InvalidAlgorithmParameterException | IllegalBlockSizeException | BadPaddingException | UnsupportedEncodingException e)
    {
      Logger.getLogger (Transcoder.class.getName ()).log (Level.SEVERE, "Problem decrypting the data", e);
    }
    return decryptedData;
  }
}

Finally, a few entries go into web.xml to tie it all together:

<!-- Report resources directory for preview. Defaults to ${birt home} -->
<context-param>
    <param-name>BIRT_VIEWER_WORKING_FOLDER</param-name>
    <param-value>report</param-value>
</context-param>

<context-param>
    <description>The secret key used to encrypt/decrypt information between servers and browsers. Must never be transmitted!</description>
    <param-name>ENCRYPTION_KEY</param-name>
    <param-value>{your encryption key}</param-value>
</context-param>

<filter>
    <filter-name>AuthFilter</filter-name>
    <filter-class>{package.path}.AuthFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>AuthFilter</filter-name>
    <servlet-name>EngineServlet</servlet-name>
</filter-mapping>
<filter-mapping>
    <filter-name>AuthFilter</filter-name>
    <servlet-name>ViewerServlet</servlet-name>
</filter-mapping>

<servlet>
    <servlet-name>AuthManagerServlet</servlet-name>
    <servlet-class>{package.path}.AuthManagerServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>AuthManagerServlet</servlet-name>
    <url-pattern>/smanage</url-pattern>
</servlet-mapping>

<session-config>
    <session-timeout>10</session-timeout>
</session-config>

<listener>
    <listener-class>{package.path}.SessionLifecycleListener</listener-class>
</listener>

That concludes this series on implementing an SSO module for the BIRT Report Viewer. Hope it helps you build your own custom implementation. If you have any questions, observations or suggestions, just leave a comment and I’ll answer as best I can.


Building a Single Sign-On Module for the BIRT Report Viewer – Part 3-1

This is the third (actually the first part of the third) post of the BIRT SSO series, wherein I describe the implementation of a single sign-on module for the Eclipse BIRT Report Viewer. This post deals with the technical details of the module itself. The introduction and server configuration are covered in Part 1 and Part 2 respectively. It is recommended that you read them first in order to get acquainted with the background and the premises on which this solution is built. This post is split into two to keep the individual post lengths sane.

Part 3: The Module

There are actually two parts to the module: one runs on the Drupal application, listens for user login and logout events, and communicates these events to the reporting component; the other sits in the report viewer webapp, listens for requests and session lifecycle events, and manages user authentication. You will have to design the first part specifically for the platform on which your main application runs, using my Drupal example as a reference. The second part, which runs in the report viewer, can be used as is. For the sake of brevity, I will refer to them hereon as the Drupal Module and the BIRT Module respectively.

3.1: The Drupal Module

For those of you familiar with the Drupal 7 API (and in particular the Field API), it may be of interest to know that this module defines its own field type, which can be attached to any entity type. All configuration options (listed below) are defined at a field instance level. There are two display modes: Embedded Report and External Link. The embedded report may be useful when you want to show the report (in an iframe) when the node page is loaded. The external link mode is more suitable for lists and tables.

However, letting too many of the Field API-specific details into this post would make it nearly impossible to follow for those who do not come from a Drupal programming background. I will therefore try to keep things as generic as possible.

For any implementation of this scheme to work, the following events should be generated by the underlying platform, and our module should be able to latch on to them to do its own stuff:

  1. User login
  2. User logout
  3. Session timeout (optional)
  4. Server shutdown (optional)

My implementations of these event hooks are shown in the code listings below. There is some configuration that needs to be done though, to inform this module about some details of the reporting component, like its location, servlet mappings, shared encryption key etc. In my case, it is possible to define them at an individual field instance level. You may choose to define them once globally if you like:

  1. Report Server URL: The fully qualified URL of the server where the BIRT engine webapp is running.
  2. Report Server Subfolder (Optional): If BIRT report files are being placed (see note below for how they can be made accessible to both Drupal and the reporting webapp) into a subfolder of BIRT_VIEWER_WORKING_FOLDER (web.xml of BIRT webapp), specify that here.
  3. Session Management Servlet: User sessions for authenticating users to the reporting server are automatically managed by Drupal behind the scenes. If for some reason you need to map your session management servlet to something other than the default (smanage in web.xml of BIRT webapp), you can tell the BIRT Reports module about it here.
  4. Report Viewer Servlet: The name of the servlet used to run and display the live reports. Most users would want to set this to frameset.
  5. Encryption Key: The secret key used for encrypting data transmitted to the report server. This key must be set in the report server configuration as well.

Note: The report design files must be accessible to the reporting server (if the Drupal frontend is being used to upload them, it becomes important to get this right). If Drupal and the reporting server are running on the same machine, this is easily achieved by creating a symbolic link pointing from BIRT_VIEWER_WORKING_FOLDER to the (preferably) subfolder inside the Drupal files folder where the reports are getting saved. If the servers are running on different machines, then the files must be made commonly accessible to both servers using NFS mounts or some other technique, the setup of which is beyond the scope of this article.
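
A sketch of the symlink approach – both paths below are hypothetical, so substitute your actual Drupal files directory and BIRT webapp location:

$ ln -s /var/www/drupal/sites/default/files/reports \
        /var/lib/tomcat7/webapps/birt/report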

/**
* Snippets from the main module file. One problem with Drupal (as of 7.14) is that the session id is not available at the time the user login hook is fired.
* To get around this limitation, I need to define my own table for tracking sessions and mapping them to user ids.
* The init hook, which is one of the first to fire on every page load, looks for entries in the table with blank session ids and does the requisite post-login processing.
* As a consequence of introducing this additional table, some additional logic is required to update its data on each event.
**/

/**
 * Implements hook_user_logout().
 */
function birt_reports_user_logout($account) {
  db_delete('birt_reports_sessions')
          ->condition('uid', $account->uid)
          ->condition('sid', $account->sid)
          ->execute();
  $birt_reports_auth_sessions = cache_get('birt_reports_auth_sessions');
  if (!empty($birt_reports_auth_sessions)) {
    unset($birt_reports_auth_sessions->data[$account->sid]);
    cache_set('birt_reports_auth_sessions', $birt_reports_auth_sessions->data, 'cache', CACHE_PERMANENT);
  }
  _birt_reports_report_server_auth($account->sid, 'logout');
}

/**
 * Implements hook_user_login().
 */
function birt_reports_user_login(&$edit, $account) {
  //Session ID is not yet present. So using roundabout method.
  if (user_access('access report servers', $account)) {
    db_insert('birt_reports_sessions')
            ->fields(array('uid' => $account->uid))
            ->execute();
  }
}

function _birt_reports_report_server_auth($sid, $op, $timeout = 600) {
  $params = "session_id=$sid&op=$op&timeout=$timeout";

  //Every time a field instance is created/updated, its data is also saved to the variables table for easy retrieval later on.
  $encryption_keys = variable_get('birt_reports_encryption_keys', array());
  $report_servers = variable_get('birt_reports_active_report_servers', array());
  $session_servlets = variable_get('birt_reports_session_servlets', array());
  $options = array(
      'method' => 'POST',
      'headers' => array('Content-Type' => 'application/x-www-form-urlencoded'),
  );

  //Inform all registered report servers of new event.
  foreach ($report_servers as $id => $server) {
    $key = $encryption_keys[$id];
    if (empty($key)) {
      watchdog('birt_reports', 'Encryption key not set for report server @server. Skipping authentication.', array('@server' => $server), WATCHDOG_WARNING);
      continue;
    }
    $servlet = empty($session_servlets[$id]) ? 'smanage' : $session_servlets[$id];
    $url = $server . "/$servlet";

    //AES 128 bit encryption requires the initial vector to be 16 chars long.
    //This is strictly enforced in Java, though not in PHP.
    $iv = substr(md5(md5($key)), 0, 16);
    $data = _birt_reports_encrypt($params, $iv, $key);
    $options['data'] = 'data=' . urlencode($data);
//      dpm ($data);

    $result = drupal_http_request($url, $options);
    if (isset($result->error)) {
      watchdog('birt_reports', 'Error logging in/out from report server @server. Error is: @error', array('@server' => $server, '@error' => "$result->code $result->error"), WATCHDOG_ERROR);
    }
    else {
      watchdog('birt_reports', 'Successful auth transaction with report server @server. Message is: @message', array('@server' => $server, '@message' => "$result->status_message"), WATCHDOG_INFO);
    }
  }
}

function _birt_reports_encrypt($message, $initialVector, $secretKey) {
  return base64_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_128, md5($secretKey), $message, MCRYPT_MODE_CFB, $initialVector));
}

/**
 * Implements hook_init().
 */
function birt_reports_init() {
  global $user;
  if (user_access('access report servers')) {
    $birt_reports_auth_sessions = cache_get('birt_reports_auth_sessions');
    if (empty($birt_reports_auth_sessions)) {
      $birt_reports_auth_sessions = new stdClass();
      $birt_reports_auth_sessions->data = array();
    }
    if (!in_array($user->sid, $birt_reports_auth_sessions->data)) {
      $result = db_select('birt_reports_sessions', 'b')
              ->fields('b', array('uid', 'sid'))
              ->condition('b.uid', $user->uid, '=')
              ->condition('b.sid', '0', '=')
              ->execute()
              ->fetchObject();
      $uid = $result->uid;
      if ($uid) {
        db_update('birt_reports_sessions')
                ->fields(array('sid' => $user->sid))
                ->condition('uid', $uid)
                ->condition('sid', '0')
                ->execute();

        $timeout = ini_get('session.cookie_lifetime');
        if (!is_numeric($timeout) || ($timeout < 0)) {
          $timeout = 600;
        }
        _birt_reports_report_server_auth($user->sid, 'login', $timeout);
      }
      $birt_reports_auth_sessions->data[$user->sid] = $user->sid;
      cache_set('birt_reports_auth_sessions', $birt_reports_auth_sessions->data, 'cache', CACHE_PERMANENT);
    }
  }
}

What this code is doing is basically encrypting a message string of the form session_id={session_id}&op={login|logout}&timeout={timeout} and sending this over to the reporting server. The reporting server will decrypt this using the same key that was used to encrypt it, and use whatever parameters it needs.

The (MySQL) ‘create statement’ for the database table used for mapping session ids to uids is given below:

delimiter $$

CREATE TABLE `birt_reports_sessions` (
 `uid` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'User''s Uid',
 `sid` varchar(255) NOT NULL DEFAULT '0' COMMENT 'Session ID',
 PRIMARY KEY (`uid`,`sid`),
 KEY `sid_idx` (`sid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Tracks authenticated sessions to be authorized with...'$$

Part 3-2 of this series describes the Java Module.


Building a Single Sign-On Module for the BIRT Report Viewer – Part 2

This is the second post of the BIRT SSO Series wherein I describe the implementation of a single sign-on module for the Eclipse BIRT Report Viewer. This post gets straight into the details of server configuration. It is recommended that you first read the introduction in Part 1 to get acquainted with the background and the premises on which this solution is built.

Part 2: Server & Environment Configuration

I had noted in Part 1 that I hosted my report server under a sub-path of the top level domain. For this there needs to be a form of inter-process communication enabled via mod_jk in order for Apache to pipe requests and responses to and from Tomcat. mod_jk is easy to compile from source, if your particular Linux distribution does not happen to supply it from its package repository.

You’ll need the apxs tool in order to compile the extension. On a Fedora system, this is available in the httpd-devel package. Once you’ve downloaded and extracted the tomcat-connectors source bundle, cd into the native folder and issue the following commands:

$ ./configure --with-apxs=/usr/sbin/apxs
$ make

Then copy the apache-2.0/mod_jk.so file into /usr/lib[64]/httpd/modules. Edit your httpd.conf file and add the following lines:

LoadModule jk_module modules/mod_jk.so
JkWorkersFile conf/workers.properties

Then create a new file /etc/httpd/conf/workers.properties and add the following lines:

worker.list=worker1
worker.worker1.port=8009
worker.worker1.host=localhost
worker.worker1.type=ajp13

This configuration assumes that your Tomcat server is running on the same machine as Apache, but that is not a necessary condition. I’m running my Drupal application under a vhost, so the JkMount directive is placed inside the vhost block. If your application is deployed directly (not under a vhost), the directive should go into the main httpd.conf instead.


<VirtualHost *:80>
    ServerName yourdomain.com
    ...
    ...
    JkMount /birt/* worker1
</VirtualHost>

Your Tomcat CATALINA_BASE/server.xml file should contain the following lines:

<!-- Define a SSL HTTP/1.1 Connector on port 8443.
     This connector uses the JSSE configuration; when using APR, the
     connector should be using the OpenSSL style configuration
     described in the APR documentation -->
<Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol" SSLEnabled="true" maxThreads="150" scheme="https" secure="true" keystoreFile="${user.home}/.keystore" keystorePass="changeit" clientAuth="false" sslProtocol="TLS" />

<!-- Define an AJP 1.3 Connector on port 8009 -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" enableLookups="false"/>

Your tomcat-users.xml file should have the manager and admin roles defined, something like:

<tomcat-users>
  <role rolename="manager"/>
  <role rolename="admin"/>
  <user username="root" password="password" roles="admin,manager"/>
</tomcat-users>

Finally, the BIRT SSO module requires that the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files be downloaded and made available to the JRE on which it will be run. For Java 1.7, the policy files are available at: http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html.

NOTE: Although the instructions tell you to install the jar files in JAVA_HOME/lib/security in case you’re running tomcat on a JDK, they must actually be put in JAVA_HOME/jre/lib/security. In case you’re running on a JRE directly, the instructions on the site should work.
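
With the policy bundle extracted, installation amounts to copying the two jars over the defaults – the jar names are those shipped in the JCE download; adjust the paths to your install:

$ cd UnlimitedJCEPolicy
$ sudo cp local_policy.jar US_export_policy.jar "$JAVA_HOME/jre/lib/security/"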

This concludes the server and environment setup required for the module to work. Part 3 of this series delves into the details of the module implementation.


Building a Single Sign-On Module for the BIRT Report Viewer – Part 1

This is the first post of a series in which I attempt to lay down a scheme for integrated authentication (or single sign-on) between the Eclipse BIRT Report Viewer and some other independent web application.

Part 1: Introduction

The Eclipse BIRT project has long been one of the best open source alternatives for anyone looking to hook up a powerful web-based reporting component to drive their business analytics (Here’s a useful comparison matrix for other open source reporting platforms).

There are several integration points available that you can use to tie the reporting engine in to your existing applications. Arguably the easiest way to do it is to just deploy the BIRT Report Viewer webapp onto a server like Tomcat or Jetty. The application is best described on the website itself (http://www.eclipse.org/birt/deploy/#viewer):

The BIRT Viewer can be used in a variety of ways:

  • As a stand-alone, pre-built web application for running and viewing reports.
  • As a starter web application that you can customize to your needs.
  • As an example for learning how to build your own reporting web application, or to learn how to use the BIRT engine.
  • As a way to run a report using a URL. This option provides a simple way to integrate BIRT reporting into applications built using non-Java technology such as Perl, PHP or even static web pages.

The BIRT viewer is a web application included with BIRT to perform the report preview operation within Eclipse. It is also a sample of how to integrate birt with a web application.

The webapp is fully functional and capable of running any BIRT report, including drill-downs, sub-reports and interactive charts. It also supports runtime parameter entry through a nice AJAX interface. There are built-in handlers for exporting the rendered report or the raw data into various formats, including, but not limited to, PDF, DOC and XLS. There’s also support for client-side and server-side printing, pagination and TOC. The interactive portions of the UI are fully AJAX-driven.

Being this feature-rich means there is usually little to no functional enhancement required for the application to be usable in a production environment right out of the box. Not bad for a “sample” app!

The only thing it lacks is application-managed authentication. I see this as an advantageous omission, since the developer (or system integrator) would actually need the flexibility to decide upon or design the authentication mechanism to use, depending on which other system(s) this reporting app is being integrated with. More often than not, one would look for a single sign-on solution that lets users seamlessly switch between the reporting context and any other business context that the application ecosystem provides.

Outlined in this series of posts is one such scheme that I designed for a hybrid system comprising a PHP application providing most of the functionality (including user management) and the BIRT Report Viewer as the reporting component. Users would log in through the PHP application, and this step would serve to authenticate them to the report server as well.

DISCLAIMER: I do not claim to be a web security expert, and there could be (and probably are) vulnerabilities in the scheme that I describe hereon. If you’re planning to use this scheme for your own deployment, please take care to thoroughly examine it first for any security holes. I am not responsible if  your decision to adopt my method leaves your system(s) vulnerable to attack. If you do find something that ought to be rectified, please leave a comment describing the problem, and if possible, a solution.

Now that we’ve got the niceties out of the way, on to the details – starting with a description of the specific ecosystem for which this scheme was originally designed:

  • A Drupal (PHP-based CMS) application running on an Apache 2.2 server, say at http://yourdomain.com.
    • This is the only application in the hybrid system which has access to (persistent) user data, and is the primary site through which users log in.
    • Upon login, this application sets a cookie in the user’s browser which will later be required by the report server as well. This places a restriction on where (URL) the report server can be located. It can either be at http://yourdomain.com/<report root> or at http://<report root>.yourdomain.com[:port]. In the latter case, the cookie domain must be set to .yourdomain.com in order for the cookie to be valid for both the top level domain and subdomain. (CAUTION: This makes the cookie valid for ANY subdomain under yourdomain.com, meaning the browser will send it across for any request made to any URL of the form <subdomain>.yourdomain.com[:port]/*, including possibly those which should not be privy to it.)
  • The BIRT Report Viewer running on Apache Tomcat 7.0.
    • In my case the report component was accessible at http://yourdomain.com/<report root>. This needs the Apache-Tomcat connector module (mod_jk) to be enabled. The configuration is explained later.
    • The report viewer does not have direct access to the user master data. Instead it relies on server-to-server communication, taking place in the backend when a user logs in, for authentication data. This data is stored in memory for the lifetime of the session, which can be expired by either a session timeout, a user logout event, or a server (Tomcat) shutdown.

That’s it for the introduction. Part 2 of this series delves into the details of the server configuration.


Syncing with bitpocket – a flexible, open source alternative to Dropbox

This continues from my previous post on the various online storage/sync solutions available today.

I’ve been a Dropbox (and Box, and Google Drive) user for a while now, and like it for its convenience. It is easy to use and setup, and lets you keep multiple devices in sync with next to no effort. However, I’ve always had some concerns over privacy and  security issues. In light of the recent attack on the service provider, I started wondering how safe my files and accounts really are (not just with Dropbox, but actually with any online storage solution, including a home-brewed one).

I also have some concerns regarding the privacy of my documents. Say, I’ve got some sensitive data uploaded to an online storage service. Who’s to say these documents are safe from data mining, or (god forbid) human eyes? (I’m not pointing fingers at any individual storage provider here. Some may respect your privacy, others may not.) Many people would be extremely wary of the possibility of information harvesting (even if it is completely anonymized and automated) and/or leakage.

Then of course, there are some less critical, but nevertheless important limitations:

  1. Only x GB of (free) storage space. One can always upgrade to a paid package, but I don’t want to pay for 50 GB of storage when I’m only going to use 10 GB in the foreseeable future. There are services that provide a large amount of storage space for free, but most of them still charge you for bandwidth usage above a fraction of that amount.
  2. No support for multiple profiles. You have to put EVERYTHING you want to sync under one single top-level folder. This may not be a suitable or acceptable restriction in all situations.
  3. Lack of flexibility – you don’t get to move your repository around if you need to. Once you subscribe to a service, you’re locked into using their storage infrastructure exclusively.

Not all of the limitations I’ve described so far are present in any single service, nor are they a matter of concern for everybody. These are just a few issues that got me going on a personal quest to find a better alternative.

There are actually quite a few ways of setting up your own personal online storage and sync solution, whose security is limited only by your ability to configure it. But the most visible benefit over any existing service is the flexibility –

  1. to use a storage infrastructure of your choice, and
  2. to manage multiple profiles.

The rest of this post documents my experiments with one such solution, named bitpocket. It performs 2-way sync by using a wrapper script to run rsync twice (once on the master, once on the slave). It can also detect, and correctly propagate file deletions. It does have one limitation in that it doesn’t handle conflict resolution. You have been warned. (Unison is supposedly capable of this, but that is another post ;-).)

The basic setup instructions are right on the project landing page. Follow them and you’re all set. I’ll elaborate on two things here –

  1. how to do a multi-profile setup, and
  2. how to alleviate the problem of repeated remote lockouts when multiple slaves always try to sync at the same time.

Multiple profiles

I’ve got two folders on my laptop that I want to sync:

  1. /home/aditya/scripts
  2. /home/aditya/Documents

I want these two folder profiles to be self-contained, without requiring tracking at their common parent. Following the instructions on the project page, I ran bitpocket init inside each of the above folders. On the master side (an EC2 micro-instance running a 64-bit Amazon Linux AMI), I have a single folder, /home/ec2-user/syncroot, under which I want to track all synced profiles. So, in the config file inside each profile folder on the slave machine, I set the REMOTE_PATH variable as follows:

  1. For /home/aditya/scripts
    REMOTE_PATH="/home/ec2-user/syncroot/scripts"
  2. For /home/aditya/Documents
    REMOTE_PATH="/home/ec2-user/syncroot/Documents"
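
For reference, the slave-side config for the scripts profile then looks roughly like the sketch below. The file lives in the profile’s .bitpocket folder; the REMOTE_HOST value shown here is hypothetical, and the exact variable set may differ between bitpocket versions, so check the template that bitpocket init generates:

# /home/aditya/scripts/.bitpocket/config (illustrative)
REMOTE_HOST="ec2-user@my-ec2-host.example.com"   # hypothetical master host
REMOTE_PATH="/home/ec2-user/syncroot/scripts"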

That’s it! You can manage as many profiles as you want, with each slave deciding where to keep its local copy of each profile.

Preventing remote lockouts

Say all your slaves sync their system clocks against a network time source; they will agree with each other to within a second (or finer). If all their crons run at 5-minute intervals, every slave attempts to connect to the master at exactly the same time. The first to establish a connection starts syncing, and all the others get locked out; this happens on every cron run. The problem is exacerbated by the fact that even a no-op sync takes at least a few seconds, and the lockout holds for that duration. We’re thus left with a very inefficient system that can sync ONLY one slave per cron run. If one slave is on a network with consistently lower latency to the master than all the others, the rest basically never get a chance to connect! Even if that is not the case, each cron run syncs just one of the N slaves, a success rate of 1/N. Not good.

One way to alleviate this (though not entirely) is to introduce a random delay, shorter than the cron interval, between the moment cron fires and the moment the connection is actually attempted. Over several cron runs, this scheme evens out each slave’s odds of running into a remote lockout. Local lockouts are not a problem: bitpocket uses a locking mechanism to prevent two local processes from syncing the same tracked directory at the same time, and if a new process encounters a lock on a tracked directory (meaning the previously spawned process hasn’t finished syncing yet), it simply exits. The random delay is introduced as shown below (assuming a cron interval of 5 minutes):

#!/usr/bin/env bash
# Wrapper that syncs one bitpocket profile (the directory passed as $1)
# after a random delay.

cd "$1" || exit 1
PIDFILE="$1/.bitpocket/run.pid"

# Random delay of 0-299 seconds, i.e. shorter than the 5-minute cron interval.
sleep $(( RANDOM % 300 ))

# If the previously spawned bitpocket process is still alive, bail out.
if [ -e "${PIDFILE}" ] && ps -u "$USER" -f | grep -q "[ ]$(cat "${PIDFILE}")[ ]"; then
  echo "Already running."
  exit 99
fi

# The previously spawned process is dead, so there should be no lock at this
# point. This step corrects for an unclean shutdown.
rm -rf .bitpocket/tmp/lock
/usr/bin/bitpocket cron &

echo $! > "${PIDFILE}"
chmod 644 "${PIDFILE}"

That’s it! Assuming you’ve saved this file as /usr/bin/bpsync and made it executable, edit your crontab entries like so, and you’re done:

*/5 * * * *     bpsync ~/Documents
*/5 * * * *     bpsync ~/scripts
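
Before trusting a profile to cron, you can dry-run the wrapper by hand with shell tracing, and watch the random delay, the PID check, and the sync happen:

bash -x /usr/bin/bpsync ~/Documents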

Happy syncing!

EDIT: I ran into trouble with stale server-side locks preventing further syncs from any slave. This happens when a slave disconnects mid-sync for whatever reason. Lock cleanup is currently the responsibility of the slave process that created the lock, and there is no mechanism on the server to detect and expire stale locks (see https://github.com/sickill/bitpocket/issues/16). This issue needs to be fixed before this syncing tool can be left to run indefinitely, without supervision.

EDIT #2: One quick way to dispose of stale master locks is to periodically run a little script on the server that checks each sync directory for open files (i.e. whether some machine is currently mid-sync). If none are found, it deletes the leftover lock files. The script, and the corresponding crontab entry, are below:

#!/bin/bash
# Server-side cleanup: remove stale bitpocket locks from profiles that have
# no open files (i.e. no sync currently in progress).

cd ~/syncroot || exit 1
for DIR in *; do
  # lsof +D lists open files under the directory; empty output means idle.
  OUT=$(/usr/sbin/lsof +D "$DIR")
  if [ -z "$OUT" ]; then
    rm -rf "$DIR/.bitpocket/tmp/lock"
  fi
done

And the crontab entry, assuming the script is saved as /usr/bin/cleanup.sh and marked executable:

*/5 * * * * /usr/bin/cleanup.sh


Optimizing Apache (httpd) for a development setup

The Apache configuration file (usually httpd.conf) that ships with most Linux distros contains settings that are sub-optimal for running PHP applications (though many of the optimizations described below benefit other applications as well). I’ve found, after a bit of trial and error, that the following settings let you squeeze a little extra out of the server running on your development machine.

The settings below affect the server’s overall memory usage (even when idle), since they define the minimum number of httpd processes (or threads) that are spawned and maintained. On my laptop, with 4 GB of RAM, the prefork and worker MPM settings are as follows (Apache uses only one of the two, depending on how it was compiled):

<IfModule prefork.c>
    StartServers 8
    MinSpareServers 5
    MaxSpareServers 20
    ServerLimit 50
    MaxClients 50
    # Respawn a process after it serves 300 requests; curtails footprint
    # bloat due to memory leaks. (Comments must sit on their own lines in
    # Apache config; trailing comments cause a syntax error.)
    MaxRequestsPerChild 300
</IfModule>

<IfModule worker.c>
    StartServers 2
    MaxClients 50
    MinSpareThreads 25
    MaxSpareThreads 50
    ThreadsPerChild 25
    MaxRequestsPerChild 300
</IfModule>
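
If you’re not sure which MPM your httpd binary was built with, list its compiled-in modules and look for prefork.c or worker.c:

# Lists compiled-in modules. (On Debian-based systems the binary is
# typically called apache2 instead of httpd.)
httpd -l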

Other than tweaking the process/thread variables, there are a few more settings to look at:

# Reduces page load time by reusing TCP connections.
KeepAlive On
MaxKeepAliveRequests 100
# Raise this for longer-lived browser-server connections, but note that under
# many simultaneous requests a high value can make some of them queue up
# unnecessarily.
KeepAliveTimeout 15
# Disable ETags so client caching keys off Last-Modified/Expires instead.
FileETag None

Additionally, you can instruct Apache to gzip-compress content sent over the network by adding the following lines (I’ve put them in a separate mod_deflate.conf under /etc/httpd/conf.d, but that’s not necessary):

# GZIP configuration for enabling compression. Requires mod_deflate
# (plus mod_setenvif and mod_headers for the lines below).
SetOutputFilter DEFLATE
# Netscape 4.x can only handle compressed text/html...
BrowserMatch ^Mozilla/4 gzip-only-text/html
# ...and Netscape 4.06-4.08 can't handle gzip at all.
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Mozilla/4 but copes fine; undo the restrictions.
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
# Don't compress images; they are already compressed.
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
# Have proxies cache compressed and uncompressed responses separately.
Header append Vary User-Agent env=!dont-vary
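
Once Apache has been restarted, a quick way to verify that compression is actually happening is to request a page while advertising gzip support and inspect the response headers:

# Expect a "Content-Encoding: gzip" line in the output.
curl -s -D - -o /dev/null -H 'Accept-Encoding: gzip' http://localhost/index.html | grep -i 'content-encoding'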

Assuming you’re doing PHP development, you might as well enable APC. I’ve occasionally seen APC use stale opcode even when a newer version of a source file is available, which is why some people prefer to keep APC disabled on development machines. On Red Hat-like systems (Fedora, CentOS, RHEL, Amazon Linux, etc.), enable the following settings in /etc/php.d/apc.ini:

apc.enabled=1
; 512M is usually too much if you're running just one application; it's
; better to start with 128M and scale up only if that falls short. You can
; check APC memory usage with the apc.php file that ships with the default
; APC installation on most systems (http://pecl.php.net/package/APC).
apc.shm_size=512M

While we’re on the subject of optimization, we might as well tune our MySQL server to use its built-in query cache. Add the following line to the [mysqld] section of /etc/my.cnf:

query_cache_size=64M
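
After restarting MySQL, you can confirm the cache is enabled and, later, keep an eye on its hit rate:

# Confirm the cache settings took effect.
mysql -u root -p -e "SHOW VARIABLES LIKE 'query_cache%'"
# Watch Qcache_hits and Qcache_inserts accumulate as your app runs.
mysql -u root -p -e "SHOW STATUS LIKE 'Qcache%'"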

Restart your Apache and MySQL servers, and enjoy the speed!

