CF8 PDF Manipulation: Pulling Text Out

So, this morning a friend called me up with a problem. She had received some PDF files from her insurance company, and she needed the data in Word or Excel for manipulation. Now, she could cut and paste the information, but that was time-consuming. She went to the Adobe site, trying to find info, and saw 'ColdFusion' on the homepage. This sparked her brain, because she immediately went, "Hey, Cutter does something with ColdFusion! Maybe he can help me!"

Lucky for her, we now have ColdFusion 8, with its built-in PDF support through the use of the CFPDF tag. I had to do a tiny bit of research on this, because Adobe's CF LiveDocs weren't overly clear, but I eventually found out that I could extract text with some very simple DDX processing directives.

Ray did a series of posts recently about working with PDF documents. Although none of them answered my question directly, he had written one about using the DDX processing directives. This sent me searching the Adobe site for more information, which is where I came upon the Understanding DDX developer documentation. Basically, by rewriting Ray's simple example, I was able to extract all of the DocumentText from the PDF and dump it into an XML file. First I need the DDX, which is just some simple XML:

<cfsavecontent variable="myddx">
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
    <DocumentText result="OutXML">
        <PDF source="Title"/>
    </DocumentText>
</DDX>
</cfsavecontent>
<cfset myddx = trim(myddx)>

Then, I verify the validity:

<cfif isDDX(myddx)>
yes, its ddx
<cfelse>
no its not
</cfif>

Now, a little explanation. Looking at the DDX, you'll notice I've defined a result and a source. I had tried to define my file names here directly, but ColdFusion didn't like that when I hit the CFPDF tag. Apparently, when using the processddx action of the tag, you are required to define your inputfiles and outputfiles. Further study of the LiveDocs shows that ColdFusion is expecting structures for these definitions. So, the DDX references certain structure keys (OutXML and Title), which you must define prior to processing your PDF.

<cfset inputStruct = StructNew() />
<cfset inputStruct.Title = "rptLauncher2.pdf" />

<cfset outputStruct = StructNew() />
<cfset outputStruct.OutXML = "words2.xml" />

You now have all of the necessary pieces. All that's required is your call to process your DDX directives.

<cfpdf action="processddx" ddxfile="#myddx#" name="VARIABLES.doc" inputfiles="#inputStruct#" outputfiles="#outputStruct#" />

I CFDump the VARIABLES.doc to see my success or failure, which comes out just fine. I now have a file, words2.xml, sitting in my server's folder, which contains all of the content of the PDF file. Simple and sweet.

CF8 Ajax Grid: Renderers and Events

So, I was doing a real quick, down and dirty form and results app for something internal. It was way temporary, with little scale-out, so I wrote a form and processor, then used the CF8 DataGrid for the results display. Problem was, two of the fields were textareas that could contain a lot of info, so I needed a quick way to show an expanded details set. Now, I could have been using the ExpanderRow plugin, but this was just quick implementation, prototyping type stuff.

What I needed was a column of icons that I could then link to a CFWindow with the total display. Now, I have to use a Cell Renderer to place the image in the empty column, but first I need the column.

<cfgridcolumn name="Details" header="" width="25" display="true" />

After that, I create a basic Cell Renderer:

setDetailButtonRenderer = function(grid,cm,col){
    cm.setRenderer(col,function(value,p,r,ind){
        var retVal = "<img src='/resources/images/icons/book_link.gif' width='16' height='16' alt='Details' />";
        return retVal;
    });
    grid.reconfigure(grid.getDataSource(),cm);
}

This didn't entirely work out, as it placed the image in every row, even if there wasn't a record. So, time to improvise. I adjust to see if there's a value for a cell in this row's 'record', to determine whether I need the image.

setDetailButtonRenderer = function(grid,cm,col){
    cm.setRenderer(col,function(value,p,r,ind){
        var ds = grid.getDataSource();
        var theRecord = ds.getAt(ind);
        if(theRecord.get('TS') != null){
            var retVal = "<img src='/resources/images/icons/book_link.gif' width='16' height='16' alt='Details' />";
            return retVal;
        }
    });
    grid.reconfigure(grid.getDataSource(),cm);
}
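If you want to poke at the "image or nothing" decision outside of the grid, it can be pulled into a tiny function. This is only a rough sketch: renderDetailCell and fakeRecord are names I made up for illustration, the only thing assumed about a record is that it has a get() method, and returning an empty string for blank rows is my own choice rather than anything the grid requires.

```javascript
// Hypothetical helper (not part of ColdFusion or Ext): decides what the
// Details cell should contain for a given record.
function renderDetailCell(record) {
    var icon = "<img src='/resources/images/icons/book_link.gif' width='16' height='16' alt='Details' />";
    // Filler rows have no record, or a record with nothing in the TS field
    if (!record || record.get('TS') == null || record.get('TS') === '') {
        return '';
    }
    return icon;
}

// Minimal stand-in for a grid record, so the helper can run without a grid
function fakeRecord(data) {
    return { get: function (name) { return data[name]; } };
}

console.log(renderDetailCell(fakeRecord({ TS: '2007-08-01 10:15:00' })) !== ''); // true
console.log(renderDetailCell(fakeRecord({ TS: null })) === '');                  // true
```

Inside the real renderer you would hand it the record you already looked up with ds.getAt(ind).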

Alright, to call the renderer into play I have an init method that is fired by the CF ajaxOnLoad() method.

init = function(){
    var repGrid = ColdFusion.Grid.getGridObject('reportsGrid');
    var repCM = repGrid.getColumnModel();

    setDetailButtonRenderer(repGrid,repCM,8);
}

Now we're halfway there. Next I need to get a 'click' on the image cell. You do this by accessing the underlying Ext functions of the Grid object itself, for which you already have a reference (repGrid).

init = function(){
    var repGrid = ColdFusion.Grid.getGridObject('reportsGrid');
    var repCM = repGrid.getColumnModel();

    setDetailButtonRenderer(repGrid,repCM,8);

    repGrid.on('cellclick',function(grid,rowIndex,columnIndex,e){
        if(columnIndex==8){

        }
    });
}

We are configuring an on cellclick function here, which is really a listener on the row itself. We further narrow it to only perform an action if the column that the cursor was in 'on click' was our Details column, which is the 9th column of our grid, including hidden columns (remember that this uses a JavaScript array, which starts with zero, so the index you reference is always the column's position minus one).
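If counting columns by hand feels fragile, the position-minus-one rule is easy to sketch with a plain array. Everything here is made up for illustration: columnIndexOf is my own helper, and the column list is hypothetical apart from the TS, FEATUREPURPOSE, FEATUREFUNCTION and Details names that show up in this post.

```javascript
// Hypothetical column list, in grid order, hidden columns included
var columnNames = ['ID', 'TS', 'NAME', 'EMAIL', 'FEATURENAME',
                   'FEATUREPURPOSE', 'FEATUREFUNCTION', 'STATUS', 'DETAILS'];

// Returns the zero-based index of a column name, or -1 if it is not in the list.
// The comparison ignores case.
function columnIndexOf(names, wanted) {
    var target = String(wanted).toUpperCase();
    for (var i = 0; i < names.length; i++) {
        if (String(names[i]).toUpperCase() === target) {
            return i;
        }
    }
    return -1;
}

var detailsIndex = columnIndexOf(columnNames, 'Details');
console.log(detailsIndex); // 8: the 9th column, counting from zero
```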

Next thing we need is a quick modal pop-up for our 'Details.' CFWindow makes a great candidate for this.

<cfwindow name="winDetails" title="Details" draggable="false" resizable="false" initShow="false" height="600" width="600" />

It's invisible when initialized, because we only want to show it 'on click'. We need a quick method for 'showing' the window.

function showRecWin(){
    ColdFusion.Window.show('winDetails');
}

We can now reference this in our 'on click' function.

repGrid.on('cellclick',function(grid,rowIndex,columnIndex,e){
    if(columnIndex==8){
        showRecWin();
    }
});

OK, we get our window, but now we need some data. Now, I could do an ajax call for the data, but it's already in my cell. It's just too long to easily display in the grid. Rather than do another server call, I'll just query the grid's Data.Store for the information.

repGrid.on('cellclick',function(grid,rowIndex,columnIndex,e){
    if(columnIndex==8){
        showRecWin();
        // This empties out any previously displayed content
        document.getElementById("winDetails_body").innerHTML = "";
        var ds = grid.getDataSource();
        var theRecord = ds.getAt(rowIndex);
        var valPurpose = theRecord.get('FEATUREPURPOSE');
        var valFunction = theRecord.get('FEATUREFUNCTION');
        document.getElementById("winDetails_body").innerHTML = "<b>Purpose:</b><br />" + valPurpose + "<br /><br /><b>Function:</b><br />" + valFunction;
    }
});

Really simple, as long as you remember that ColdFusion's creation of the grid's ColumnModel will uppercase all of your cfgridcolumn's name attributes.
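One way to keep the uppercase rule out of your head is to wrap the lookup. The sketch below is just that, a sketch: getField, escapeHtml and buildDetailsHtml are names I invented, and the escaping step assumes the textarea content is plain text that shouldn't be treated as markup, which may not be true for your data.

```javascript
// Escapes the characters that would otherwise be read as markup.
function escapeHtml(text) {
    return String(text == null ? '' : text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Looks a field up by its cfgridcolumn name, uppercased the way the
// grid's records store it.
function getField(record, columnName) {
    return record.get(String(columnName).toUpperCase());
}

// Builds the same markup the click handler writes into the window body.
function buildDetailsHtml(record) {
    return "<b>Purpose:</b><br />" + escapeHtml(getField(record, 'featurePurpose')) +
           "<br /><br /><b>Function:</b><br />" + escapeHtml(getField(record, 'featureFunction'));
}
```

In the click handler, the result of buildDetailsHtml(theRecord) would be what gets assigned to the innerHTML of winDetails_body.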

That's it. Really doesn't take a whole lot. A little digging in the documentation for the 1.1.1 version of the ExtJS library will give you a ton of information.

CFGrid Gotcha

So, I'm finally playing with some of the new Ajax controls built into ColdFusion 8. They're based on ExtJS (for the most part), and I thought it would be cool to dig in and see what I could do.

So, pulled up the documentation. First I built a basic CFC, with a remote access method that pulls all of the records from the Art table of the cfartgallery db. Then I built the display page, with the cfgrid and cfgridcolumn tags. I used the bind attribute to bind the grid to the cfc method. Tried it out and...error.

CFGRID: Response is empty [Enable debugging by adding 'cfdebug' to your URL parameters to see more information.]

OK. Fun. No response messages showing in Firebug, but the right parameters were getting passed through. Google is your friend, right? One reference that I could find, in the comments on a post at Ben's site, but it only pointed me towards the Application.cfc, with no explanation on what the problem was or how to fix it.

So, I changed the file name of my Application.cfc. I didn't need it this early in the game, so I took it out of play. Voila! It works. OK, so what's in the Application.cfc?

Well, I had already commented out the onError method (figuring out an issue with Coldspring). There wasn't any output in any of the methods. I went over all of my attributes and mappings...nothing. Then I noticed something.

I took the Application.cfc template from Ray's site, with very minor adjustments. I finally noticed that one function didn't have an 'output' attribute, onRequest.

<cffunction name="onRequest" returnType="void">
    <cfargument name="thePage" type="string" required="true" />
    <cfinclude template="#arguments.thePage#" />
</cffunction>

Once I commented this function out the call worked perfectly. Well, lessons learned...

ColdFusion 8 Fun: Looping Files

OK, so I've been working on my mother's website for...well, too long. One of the reasons is I've been waiting on her to get approval to get a feed of listings, so we can put them directly on her site. Well, she finally got the approval, so I've been having fun this weekend, pulling in data and images, setting up database tables. The Works.

These feeds are tab delimited text files. The first line is a listing of all of the columns, and all of the rest are the data. So, I set up a staging table, with column names that match those in the file (luckily they provide a listing of the columns, along with their data type and length, in a separate .log file). Next, I used the Illudium PU-36 Code Generator to quickly give me some data access objects, and then settled down to write a little code.

Now, my first file has 7,000+ records in it, so I go ahead and give myself a little time for the code to do its job.

<cfsetting enablecfoutputonly="true" requesttimeout="600" />

Next thing I wanted were a few variables and objects to work with.

<cfset VARIABLES.lineNum = 1 />
<cfset VARIABLES.filePath = expandPath(".") & "\myFile.txt" />
<cfset VARIABLES.Bean = CreateObject("component","feedRecord") />
<cfset VARIABLES.DAO = CreateObject("component","feedRecordDAO").init(APPLICATION.dsn) />

And then I set up the loop on the file.

<cfloop file="#VARIABLES.filePath#" index="VARIABLES.line">
    <!--- Code to go here --->
</cfloop>

OK, for those who don't know, the DAO object that is created by the code generator takes a bean object as the argument for the save() method. The bean object has an init() method with all of the column names as non-required arguments. So, how to best initialize my bean? Well, the data file's first row is a tab delimited list of the column names, so I decide to use it. First, I only want the first row to give me a data structure of the column names, in the order I'll need them. Hmmm? Ok, I decide to use an Array.

<cfloop file="#VARIABLES.filePath#" index="VARIABLES.line">
    <cfif VARIABLES.lineNum gt 1>
        <!--- This is for later --->
    <cfelse>
        <cfset VARIABLES.propOrder = ArrayNew(1) />
        <cfset VARIABLES.lineCount = 1 />
        <cfloop list="#VARIABLES.line#" index="VARIABLES.listItem" delimiters="#Chr(9)#">
            <cfset VARIABLES.propOrder[VARIABLES.lineCount] = VARIABLES.listItem />
            <cfset VARIABLES.lineCount++ />
        </cfloop>
        <cfset VARIABLES.lineCount = 0 />
    </cfif>
    <cfset VARIABLES.lineNum++ />
</cfloop>

Notice that the first part of my flow control is currently blank. I put this branch first because most lines will meet this criterion, and that's where the meat of the processing will be handled in the end. The Array, though very important, is only built on the first row of the file. That branch runs first, because of the way the flow control is written, but is bypassed throughout the rest of the process. BTW, I love the JS style operators ;)

Now, I used an Array to maintain the order of the key names, but ultimately I'll need a Struct to pass into the bean's init() method, as an argumentCollection.

<cfloop file="#VARIABLES.filePath#" index="VARIABLES.line">
    <cfif VARIABLES.lineNum gt 1>
        <cfset VARIABLES.resProp = StructNew() />
    ....

Now, I was going to list loop through each line to set my Struct, but found out the hard way that <cfloop> still skips empty items in a list. I was getting errors all over the place about truncated data and whatnot, before I noticed the data wasn't in the right place. What to do? Take a different approach! Instead of looping a list, I'll loop an Array, and make my Array from the list, while using the new includeEmptyFields option.
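To see why the empty items matter, here's a rough sketch of the problem in JavaScript rather than CFML. The column names and the data line are invented, and the function names are my own. The "dropping" version only imitates how a list loop skips empty items, and the "keeping" version plays the part of ListToArray with includeEmptyFields set to true.

```javascript
var TAB = '\t';
var header = ['MLSNUM', 'PRICE', 'REMARKS', 'CITY'].join(TAB); // made-up column names
var line = ['12345', '', '', 'Nashville'].join(TAB);           // two empty fields

// Keeps empty fields, so every value stays in its own position
function splitKeepingEmpty(text) {
    return text.split(TAB);
}

// Drops empty fields, imitating a plain list loop
function splitDroppingEmpty(text) {
    return text.split(TAB).filter(function (item) { return item !== ''; });
}

// Pairs header names with values by position
function toRecord(names, values) {
    var rec = {};
    for (var i = 0; i < names.length; i++) {
        rec[names[i]] = values[i];
    }
    return rec;
}

var names = splitKeepingEmpty(header);
var good = toRecord(names, splitKeepingEmpty(line));
var bad = toRecord(names, splitDroppingEmpty(line));

console.log(good.CITY); // Nashville
console.log(bad.PRICE); // Nashville: the city slid into the wrong column
console.log(bad.CITY);  // undefined
```

That sliding is the "data wasn't in the right place" problem: once an empty field disappears, every value after it lands under the wrong column name.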

<cfloop file="#VARIABLES.filePath#" index="VARIABLES.line">
    <cfif VARIABLES.lineNum gt 1>
        <cfset VARIABLES.resProp = StructNew() />
        <cfset VARIABLES.arrProps = ListToArray(VARIABLES.line,Chr(9),true) />
        <cfloop from="1" to="#ArrayLen(VARIABLES.propOrder)#" index="VARIABLES.itemCount">
            <cfset VARIABLES.resProp[VARIABLES.propOrder[VARIABLES.itemCount]] = VARIABLES.arrProps[VARIABLES.itemCount] />
        </cfloop>
    ....

Did you see it? Simple, eh? Now I have a Struct, where the data from each line matches up with the keys set from the first line of the file. All that's left is to set my bean and pass it to the save() method of the DAO.

<cfloop file="#VARIABLES.filePath#" index="VARIABLES.line">
    <cfif VARIABLES.lineNum gt 1>
        <cfset VARIABLES.resProp = StructNew() />
        <cfset VARIABLES.arrProps = ListToArray(VARIABLES.line,Chr(9),true) />
        <cfloop from="1" to="#ArrayLen(VARIABLES.propOrder)#" index="VARIABLES.itemCount">
            <cfset VARIABLES.resProp[VARIABLES.propOrder[VARIABLES.itemCount]] = VARIABLES.arrProps[VARIABLES.itemCount] />
        </cfloop>
        <cfoutput>Saving Record ## #VARIABLES.lineNum#. </cfoutput>
        <cfset VARIABLES.Bean.init(argumentCollection:VARIABLES.resProp) />
        <cftry>
            <cfif VARIABLES.DAO.save(VARIABLES.Bean)>
                <cfoutput>Record saved.<br /></cfoutput>
            <cfelse>
                <cfoutput>Error saving record.<br /></cfoutput>
                <!--- custom cfthrow here --->
            </cfif>
            <cfcatch type="any">
                <!--- and a custom error handler here --->
            </cfcatch>
        </cftry>
        <cfflush />
    <cfelse>
        <cfset VARIABLES.propOrder = ArrayNew(1) />
        <cfset VARIABLES.lineCount = 1 />
        <cfloop list="#VARIABLES.line#" index="VARIABLES.listItem" delimiters="#Chr(9)#">
            <cfset VARIABLES.propOrder[VARIABLES.lineCount] = VARIABLES.listItem />
            <cfset VARIABLES.lineCount++ />
        </cfloop>
        <cfset VARIABLES.lineCount = 0 />
    </cfif>
    <cfset VARIABLES.lineNum++ />
</cfloop>
<cfsetting enablecfoutputonly="false" />

That's it! Nothing to it! Now, there are probably better ways, and half of this should be encapsulated even further, and it will break if the feed provider changes the column names. But, hey, it was fun! Right?

Example code included below with the Download link.

Local Development Setup Pt 4: ColdFusion + Apache + SSL

In previous posts we set up ColdFusion on Apache, created multiple ColdFusion instances, and created Virtual Directories to remote, UNC pathed resources. What's left? Well, what if you need to test SSL secure pages? Perhaps you have areas of your sites that need to have secure encryption, where you're harvesting personal information from your users. No one feels comfortable submitting personal information online if they don't see that little lock in the bottom of their browser. You, as a developer, want to be able to code this functionality, without the need to test it in your production environment.

With a little work, setting up a secure site within Apache is relatively simple. You already completed your first step when you installed ColdFusion and Apache, because the Apache version you installed was precompiled for SSL. With a few more steps you'll be on your way.

Download the openssl.cnf.txt file from the Download link below, and place the file in your Apache bin directory (C:\Program Files\Apache Group\Apache2\bin). Then, rename the file, removing the .txt extension. After you've done this, you may not see the remaining .cnf extension in your file browser, and it may say that it's a SpeedDial file type. That's OK, it's supposed to look that way. The next thing you need to do is copy the ssleay32.dll and libeay32.dll files from your bin folder into your Windows\System32 folder. Make sure you copy the .dll files and not the .lib files. Now you're ready to create your personal security certificates.

Open a command prompt and navigate to your bin folder. Once there you can begin to use the openssl executable to create your certs. You will need one for each secure site you configure. Here we'll create one for secure.companyname.loc, by executing the following commands in your console.

openssl req -config openssl.cnf -new -out secure.csr

The .csr file can have any name, but I've named it like this so I know that it's associated with my 'secure' domain. Note that you must create a certificate for each fully qualified domain name that you wish to be secure. The web browser will scream if the domain names don't match exactly. Here is a step by step of what you should see, with my responses bracketed by percentage signs.

Loading 'screen' into random state - done
Generating a 1024 bit RSA private key
........++++++
.......++++++
writing new private key to 'privkey.pem'
Enter PEM pass phrase: %my-made-up-pass%
Verifying - Enter PEM pass phrase: %my-made-up-pass%
-----
You are about to be asked to enter information that will be incorporated into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) []:%US%
State or Province Name (full name) []:%mystate%
Locality Name (eg, city) []:%mycity%
Organization Name (eg, company) []:%companyname%
Organizational Unit Name (eg, section) []:%mydept%
Common Name (eg, your websites domain name) []:%secure.companyname.loc%
Email Address []:%username@companyname.com%

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:%my-made-up-pass%

This will create the .csr file. Now, on to the next step, the private key file.

openssl rsa -in privkey.pem -out secure.key
Enter pass phrase for privkey.pem:%my-made-up-pass%
writing RSA key

Ok, now that we have a private key all that's left is to get a certificate.

openssl x509 -in secure.csr -out secure.cert -req -signkey secure.key -days 365
Loading 'screen' into random state - done
Signature OK
subject=/C=US/ST=mystate/L=mycity/O=companyname/OU=mydept/CN=secure.companyname.loc/emailAddress=username@companyname.com
Getting Private key

Alright, now you have your certificate for your 'secure' domain. Create, within your Apache conf folder, two new folders ssl.cert and ssl.key, and move your secure.cert and secure.key files into their respective folders. You may also delete the .rnd file from your Apache bin folder. This file contains entropy information for creating the key and could be used for cryptographic attacks against your private key. Although this isn't likely within your local development environment, it is still good practice.

As this is for your local environment, this is a simple way of creating a self-signed certificate for development use. All you have to do is install the certificate in your browser the first time you come to a secure page. Also note that this certificate expires after a year, and you can increase the -days 365 if you want.

Now we start getting into actually configuring your server for your SSL connection. First you will want to remove the comment hash (#) from the LoadModule line for ssl_module modules/mod_ssl.so. This is generally the last line of the LoadModule descriptors in your httpd.conf file.

# SGB: [072408]: Enabling SSL
LoadModule ssl_module modules/mod_ssl.so

Next you'll find the IfModule block below:

<IfModule mod_ssl.c>
    Include conf/ssl.conf
</IfModule>

And add a few necessary lines:

<IfModule mod_ssl.c>
    Include conf/ssl.conf
</IfModule>
# SGB [072408]: Some added config for our SSL
SSLMutex default
SSLRandomSeed startup builtin
SSLSessionCache none
ErrorLog logs/ssl.log
LogLevel info

It is very important that you move this descriptor block below the JRun Settings descriptor block. When you define which instance serves your secure pages, you want it to know that JRun is needed.

The next step took a great deal of trial and error to get straight. First, make a backup copy of the ssl.conf file. Then, within the original file, we're going to make several changes. First, comment (with a hash sign [#]) the opening and closing IfDefine tags near the top and very bottom of the file.

# SGB [072408]: Removed for proper load
#<IfDefine SSL>
    ....
# SGB [072408]: Removed for proper load
#</IfDefine>

And, set it up for NameVirtualHost, just as you did within your httpd.conf, but with the correct port for SSL.

# SGB [072408]: Enable NameVirtualHost configurations on SSL
NameVirtualHost 127.0.0.1:443

Next, remove the entire VirtualHost block from the file. This is loaded with lines and lines of comments, is already in your backup file for later reference, and only confuses what is needed (and caused me errors somewhere anyway). We'll set up a 'secure' VirtualHost entry for your secure domain, using the certificate and key you created before.

# SGB [072408]: 'secure' SSL domain setup directive
<VirtualHost 127.0.0.1:443>
    DocumentRoot "C:\Documents and Settings\username\My Documents\wwwroot\siteroot"
    ServerName secure.companyname.loc
    ServerAdmin username@companyname.com
    ErrorLog logs/secure-ssl-error.log
    TransferLog logs/secure-ssl-access.log
    SSLEngine On
    SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL
    SSLCertificateFile conf/ssl.cert/secure.cert
    SSLCertificateKeyFile conf/ssl.key/secure.key
    <FilesMatch "\.(cgi|shtml|phtml|cfm|cfc|php3?)$">
        SSLOptions +StdEnvVars
    </FilesMatch>
    <Directory "C:\Documents and Settings\username\My Documents\wwwroot\siteroot">
        SSLOptions +StdEnvVars
    </Directory>
    SetEnvIf User-Agent ".*MSIE.*" \
             nokeepalive ssl-unclean-shutdown \
             downgrade-1.0 force-response-1.0
    CustomLog logs/ssl_request_log \
             "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
</VirtualHost>

OK, what's left? Oh yeah! We need a ColdFusion instance to associate it with (in this case the 'sites' instance). And, we'll probably need those Aliases too. Easy enough. Just add your Include statements.

# SGB [072408]: 'secure' SSL domain setup directive
<VirtualHost 127.0.0.1:443>
    DocumentRoot "C:\Documents and Settings\username\My Documents\wwwroot\siteroot"
    ServerName secure.companyname.loc
    ServerAdmin username@companyname.com
    ErrorLog logs/secure-ssl-error.log
    TransferLog logs/secure-ssl-access.log
    SSLEngine On
    SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL
    SSLCertificateFile conf/ssl.cert/secure.cert
    SSLCertificateKeyFile conf/ssl.key/secure.key
    <FilesMatch "\.(cgi|shtml|phtml|cfm|cfc|php3?)$">
        SSLOptions +StdEnvVars
    </FilesMatch>
    <Directory "C:\Documents and Settings\username\My Documents\wwwroot\siteroot">
        SSLOptions +StdEnvVars
    </Directory>
    SetEnvIf User-Agent ".*MSIE.*" \
             nokeepalive ssl-unclean-shutdown \
             downgrade-1.0 force-response-1.0
    CustomLog logs/ssl_request_log \
             "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
    Include conf/cf_sitesinstance.conf
    Include conf/site_aliases.conf
</VirtualHost>

That's it! Restart your Apache server and go to https://secure.companyname.loc (make sure you put a test 'index.cfm' in there). First you will be asked to accept the development certificate that you created, and then you should see your test message display, with the little lock down in the corner.

And that is how to configure ColdFusion (7 or 8) on top of Apache, in a multi-instance configuration, with virtual, UNC pathed directories, SSL support, and access to your instance administrators. Verify your instance settings, set up your Data Sources, fire up CFEclipse, check out from the Subversion repository, and get to writin' some code!


Resources: And a hellavalotta trial and error. No one, single post answered every issue (some didn't even answer one issue by itself), and so...here it is.

Local Development Setup Pt 3: Virtual Directories in Apache

In a previous post we touched upon the Alias statement in your VirtualHost directives. An Alias is a way of defining a virtual directory for your site. This is a pretty common practice in the web world. Your site code may be stored in your site's root (wwwroot) folder, but you may want to store your assets (images, video, stylesheets, etc.) outside your webroot. You might even want to store them on another server altogether. You'll want to access these items as if they're part of your site, like http://username.companyname.loc/Images/mypic.jpg or http://username.companyname.loc/Media/myvideo.flv.

Defining an Alias is easy, when dealing with your local file system. An Alias that points to a UNC path, in Windows, is another story. You have to use a UNC path because Apache will not allow you to utilize a mapped drive (it'll crash Apache, at least under 2.0). What took me a while to figure out was that you cannot use Apache to control the directory access (as in permissions); you must do this through the user account access. You cannot use the Apache Directory directives for access control. This also means that your Apache instance must run as the user who has access to the remote resource (I also set up the CF instance to run as the same user).

I also discovered something else the hard way: Apache Aliases are case-sensitive. This means that you have to use double Aliases for each virtual directory that you define via an Alias. Remember, this is Windows, so code can go either way. You'll want to plan for either eventuality.

In our last post we defined a 'sites' instance, for the display of a mythical, templated application system. So, imagine that each site of that system would require its own VirtualHost directive. Each one, though, would also be using shared media resources from the same location outside the webroot, like on a media server. For this, you can create another include in your Apache conf folder, site_aliases.conf, within which you could define your remote virtual directories in one place:

Alias /Images/ //remote-serv1/Inetpub/wwwroot/Images/
Alias /images/ //remote-serv1/Inetpub/wwwroot/Images/
Alias /Video/ //remote-serv1/Inetpub/wwwroot/Video/
Alias /video/ //remote-serv1/Inetpub/wwwroot/Video/
Alias /Styles/ //remote-serv1/Inetpub/wwwroot/Styles/
Alias /styles/ //remote-serv1/Inetpub/wwwroot/Styles/
Alias /SiteSpecific/ //remote-serv1/Inetpub/wwwroot/SiteSpecific/
Alias /sitespecific/ //remote-serv1/Inetpub/wwwroot/SiteSpecific/
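Typing every Alias twice gets old once the list grows, so here's a small sketch of generating the pairs instead. It's plain JavaScript and purely illustrative: buildAliasLines is a name I made up, and it only covers the mixed-case and all-lowercase spellings shown above, not every possible casing.

```javascript
// Hypothetical helper: builds the mixed-case and lowercase Alias lines
// for each directory name under a shared UNC base path.
function buildAliasLines(base, dirNames) {
    var lines = [];
    dirNames.forEach(function (dir) {
        var target = base + '/' + dir + '/';
        lines.push('Alias /' + dir + '/ ' + target);
        var lower = dir.toLowerCase();
        // Skip the second line when the name is already all lowercase
        if (lower !== dir) {
            lines.push('Alias /' + lower + '/ ' + target);
        }
    });
    return lines;
}

var aliasLines = buildAliasLines('//remote-serv1/Inetpub/wwwroot',
                                 ['Images', 'Video', 'Styles', 'SiteSpecific']);
console.log(aliasLines.join('\n'));
```

Paste the output into site_aliases.conf and the rest of the setup stays the same.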

Once you've defined these remote shared resources, you can then Include them in your site definitions in your VirtualHost directives:

<VirtualHost 127.0.0.1:80>
    ServerAdmin username@companyname.com
    # Root folder for thisdomain.com, in the 'sites' instance
    DocumentRoot "C:\Documents and Settings\username\My Documents\wwwroot\siteroot"
    ServerName *.thisdomain.loc
    # SGB [072007]: shared Alias Include
    Include conf/site_aliases.conf
    ErrorLog logs/thisdomain.loc-error.log
    CustomLog logs/thisdomain.loc-access.log common
    # SGB [072007]: Add include for the 'sites' ColdFusion instance
    Include conf/cf_sitesinstance.conf
</VirtualHost>

Once you've restarted your Apache instance you will be able to access these resources as part of the domain.

Local Development Setup Pt 2: Multiple ColdFusion Instances

Our last post discussed installing Apache and ColdFusion, as well as configuring your default instance for Apache access. Now it's time to create additional ColdFusion instances.

By default, ColdFusion (or, more appropriately, JRun) is only configured to utilize 512MB of RAM per instance, and is only capable of accessing 1024MB. This is due to a limitation of 32-bit JVMs, and will someday be formally addressed by Adobe. But that doesn't mean that you are necessarily restricted to only using 1GB of RAM for ColdFusion. You may define multiple instances of the server, each of which will address its own memory space, its own instance of the JVM, and its own instance of ColdFusion (and JRun). Not only does this allow you to utilize more of the memory available to you today, in our high powered systems, but it will also sandbox applications that are separated into their own instances.

For instance: Let's say you have a dynamic template application. One that reads the requested URL and supplies customized content dependent upon the site identified. Any number of sites could be configured in a database, rendered by the same code, off of a single instance of ColdFusion (or a clustered set of instances, maybe). You could have a 'sites' instance of ColdFusion that served this content. Now, the same set of sites might require a backend administrator, or content management system, for the configuration of those sites. You might set this up on a single domain name, with users logging in to their specific set of tools and data. It would be its own application, with dynamic options and data according to the user logging in. This might be placed in another 'control' instance of ColdFusion.

Setting up additional instances of ColdFusion is easy, but requires a small bit of manual effort when working with Apache. First of all, the connectors for JRun and Apache are not completely automatic, so you need to set up a few folders on the file system. Find the root directory for JRun. The default location is C:\JRun4. You are creating folders for the connectors, which will be located in {JRun Root}\lib\wsconfig. Notice that there is already a subdirectory titled 1. This is the connector for your default ColdFusion instance. You'll want to create an empty subdirectory for each instance you set up, named exactly as you will name your instances. According to the above example, you want to create a 'sites' directory, and a 'control' directory.

Your next step requires logging into the ColdFusion Administrator of your default ColdFusion instance. In our last post we set up a URL for accessing this, http://username.companyname.loc/CFIDE/Administrator. Once you've logged in, in the default instance (and only the default instance) you will see an option at the bottom of the menu for Enterprise Manager. You'll want to select this, and its sub-item, Instance Manager. Here you will see a samples instance that is already defined, though disabled. This is the instance for running the sample applications that ship with ColdFusion. To create a new instance you simply select Add New Instance. This will bring up the new instance dialog. In the server name type 'sites' (exactly as you named the folder, including case) and select the Create Windows Service option, then hit Submit. That's it! ColdFusion automatically goes through a four step process to create your new instance, giving you status updates along the way. Once it's complete, go back to the Instance Manager and do the same thing for your 'control' instance.

OK, so you have new instances, but Apache can't talk to them yet. We need to do a little more work on the Apache config before we can really start to play. The first thing you'll need to do here is locate the JRun Settings block in your httpd.conf. It'll look very similar to this:

# JRun Settings
LoadModule jrun_module "C:/JRun4/lib/wsconfig/1/mod_jrun20.so"
<IfModule mod_jrun20.c>
    JRunConfig Verbose false
    JRunConfig Apialloc false
    JRunConfig Ssl false
    JRunConfig Ignoresuffixmap false
    JRunConfig Serverstore "C:/JRun4/lib/wsconfig/1/jrunserver.store"
    JRunConfig Bootstrap 127.0.0.1:51000
    #JRunConfig Errorurl <optionally redirect to this URL on errors>
    #JRunConfig ProxyRetryInterval <number of seconds to wait before trying to reconnect to unreachable clustered server>
    #JRunConfig ConnectTimeout 15
    #JRunConfig RecvTimeout 300
    #JRunConfig SendTimeout 15
    AddHandler jrun-handler .jsp .jws .cfm .cfml .cfc .cfr .cfswf
</IfModule>

Alright, some major points to notice here. Two lines matter for a multiserver configuration: the Serverstore and the Bootstrap. These will be different for each instance of ColdFusion. You probably already recognize most of the path in the Serverstore value. The 'control' and 'sites' instance folders that you created will replace the 1 in your new definitions. The Bootstrap value comes from each instance's port setting in its JRunProxyService. To get this value, go to that instance's jrun.xml file, located at C:\JRun4\servers\[instance name]\SERVER-INF\jrun.xml. Open this file and find the following service definition block:

<service class="jrun.servlet.jrpp.JRunProxyService" name="ProxyService">
    <attribute name="activeHandlerThreads">25</attribute>
    <attribute name="backlog">500</attribute>
    <attribute name="deactivated">false</attribute>
    <attribute name="interface">*</attribute>
    <attribute name="maxHandlerThreads">1000</attribute>
    <attribute name="minHandlerThreads">1</attribute>
    <attribute name="port">51002</attribute>
    ....

Two things you need here. First, make sure that the deactivated attribute is set to false. Next, write down the port value. So, if you are in the jrun.xml of your 'control' instance, and the port is '51020', then write that down (control: 51020) and do the same for your 'sites' instance. Also remember that you will need to restart these instances after changing the deactivated attribute.
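
If you have several instances to check, the two values can be pulled out with a short script instead of by eye. This is only a sketch in Python, not something that ships with JRun or ColdFusion; the function name read_proxy_settings is hypothetical, and it assumes the jrun.xml file parses as well-formed XML with the service block shaped like the one shown above.

```python
import xml.etree.ElementTree as ET

PROXY_CLASS = "jrun.servlet.jrpp.JRunProxyService"

def read_proxy_settings(jrun_xml_path):
    """Return (port, deactivated) for the JRunProxyService block, or None."""
    root = ET.parse(jrun_xml_path).getroot()
    for service in root.iter("service"):
        if service.get("class") != PROXY_CLASS:
            continue
        # Collect the <attribute name="...">value</attribute> children
        attrs = {a.get("name"): (a.text or "").strip()
                 for a in service.findall("attribute")}
        # Raises KeyError if the block has no port attribute
        return int(attrs["port"]), attrs.get("deactivated") == "true"
    return None
```

Point it at C:\JRun4\servers\control\SERVER-INF\jrun.xml and you get back the port to write down, plus a flag telling you whether the proxy service is still deactivated. It only reads the file; changing the deactivated attribute and restarting the instance is still a manual step.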

Next, let's break out the information specific to the default ColdFusion instance and place it inside its own include config file. In your Apache conf directory, create a new file - cf_defaultinstance.conf. In this file we'll place those settings we want for our default instance:

<IfModule mod_jrun20.c>
    JRunConfig Verbose false
    JRunConfig Ignoresuffixmap false
    JRunConfig Serverstore "C:/JRun4/lib/wsconfig/1/jrunserver.store"
    JRunConfig Bootstrap 127.0.0.1:51000
</IfModule>

With the instance-specific settings (Serverstore and Bootstrap) in their own include, we can remove them from the httpd.conf file:

# JRun Settings
LoadModule jrun_module "C:/JRun4/lib/wsconfig/1/mod_jrun20.so"
<IfModule mod_jrun20.c>
    JRunConfig Verbose false
    JRunConfig Apialloc false
    JRunConfig Ssl false
    JRunConfig Ignoresuffixmap false
    #JRunConfig Serverstore "C:/JRun4/lib/wsconfig/1/jrunserver.store"
    #JRunConfig Bootstrap 127.0.0.1:51000
    #JRunConfig Errorurl <optionally redirect to this URL on errors>
    #JRunConfig ProxyRetryInterval <number of seconds to wait before trying to reconnect to unreachable clustered server>
    #JRunConfig ConnectTimeout 15
    #JRunConfig RecvTimeout 300
    #JRunConfig SendTimeout 15
    AddHandler jrun-handler .jsp .jws .cfm .cfml .cfc .cfr .cfswf
</IfModule>

Notice that I just commented them out. You can remove them entirely if you like, but I'm gonna leave them. Next I'm going to adjust the VirtualHost I use for administrator access to my default instance:

#
# Use name-based virtual hosting.
#
NameVirtualHost 127.0.0.1:80

    ....

<VirtualHost 127.0.0.1:80>
    ServerAdmin username@companyname.com
    # Root folder for my scratchpad stuff
    DocumentRoot "C:\Documents and Settings\username\My Documents\wwwroot"
    ServerName username.companyname.loc
    # Alias for /CFIDE, which the CF install placed in my Apache webroot.
    # This is solely for our dev environment, and would not be a good practice
    # within a production environment
    Alias /CFIDE "C:/Program Files/Apache Group/Apache2/htdocs/CFIDE"
    <Directory "C:/Program Files/Apache Group/Apache2/htdocs/CFIDE">
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>
    ErrorLog logs/username.companyname.loc-error.log
    CustomLog logs/username.companyname.loc-access.log common
    # SGB [072007]: Add include for default ColdFusion instance
    Include conf/cf_defaultinstance.conf
</VirtualHost>

Now username.companyname.loc is set up to use the default ColdFusion instance. Next, set up an include for your 'control' instance. In Apache's conf directory, create another config file - cf_controlinstance.conf. Remember those port numbers you wrote down from the jrun.xml files? The port for this instance goes in the Bootstrap value:

<IfModule mod_jrun20.c>
    JRunConfig Verbose false
    JRunConfig Ignoresuffixmap false
    JRunConfig Serverstore "C:/JRun4/lib/wsconfig/control/jrunserver.store"
    JRunConfig Bootstrap 127.0.0.1:51020
</IfModule>

Then you could define a special domain for accessing the 'control' instance's ColdFusion Administrator, by adding another VirtualHost directive to the Apache config:

<VirtualHost 127.0.0.1:80>
    ServerAdmin username@companyname.com
    # Root folder for a 'control' instance
    DocumentRoot "C:\Documents and Settings\username\My Documents\wwwroot\admin"
    ServerName control.companyname.loc
    # Alias for /CFIDE, each CF instance has its own CFIDE.
    # This is solely for our dev environment, and would not be a good practice
    # within a production environment
    Alias /CFIDE "C:/JRun4/servers/control/cfusion.ear/cfusion.war/CFIDE"
    <Directory "C:/JRun4/servers/control/cfusion.ear/cfusion.war/CFIDE">
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>
    ErrorLog logs/control.companyname.loc-error.log
    CustomLog logs/control.companyname.loc-access.log common
    # SGB [072007]: Add include for the 'control' ColdFusion instance
    Include conf/cf_controlinstance.conf
</VirtualHost>

Notice the different path for the CFIDE folder. Each created instance will have a unique CFIDE. Also notice that I changed the DocumentRoot path, to reflect the root of the application I'll use with the instance. Now that you've set up your 'control' instance, config, and VirtualHost, you can do the same thing for your 'sites' instance. Just watch your port value, Serverstore path, and CFIDE and DocumentRoot paths.
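
Since the per-instance include only differs in the instance name and the port, you could generate it rather than copy and edit it by hand. The Python below is a rough sketch under that assumption; render_instance_conf is a hypothetical helper, the jrun_root default is the install path used throughout this post, and the port must be the one you read from that instance's own jrun.xml.

```python
def render_instance_conf(instance_name, port, jrun_root="C:/JRun4"):
    """Build the text of a per-instance JRun include file for Apache."""
    # The Serverstore path swaps the default "1" folder for the instance folder
    store = f"{jrun_root}/lib/wsconfig/{instance_name}/jrunserver.store"
    lines = [
        "<IfModule mod_jrun20.c>",
        "    JRunConfig Verbose false",
        "    JRunConfig Ignoresuffixmap false",
        f'    JRunConfig Serverstore "{store}"',
        # The Bootstrap port is the JRunProxyService port of this instance
        f"    JRunConfig Bootstrap 127.0.0.1:{port}",
        "</IfModule>",
    ]
    return "\n".join(lines) + "\n"
```

Calling render_instance_conf("control", 51020) produces the same settings as the cf_controlinstance.conf shown earlier. For the 'sites' instance, pass 'sites' and its own port, then save the result under a name of your choosing in Apache's conf directory (cf_sitesinstance.conf would follow the pattern, but that file name is my assumption). The VirtualHost block, with its CFIDE and DocumentRoot paths, still needs to be written by hand.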

Local Development Setup Pt 1: Apache and ColdFusion (7 or 8)

OK, so now that we are moving to local development, it has become necessary to learn how to configure our desktops for running a local copy of our sites. The trick here is the complexity of our setup. We have multiple sites, sharing the same code base, accessing media from external resources. We also have to set up for multi-instance, to separate our front-end site and our back-end administration. We also need to be set up for one SSL site. Oh yeah, and we have to do it on Apache, since XP Pro's version of IIS has some limitations about running multiple domains simultaneously. To top it all off (at least in my case) I also have to run Apache 2.0.59 so that I can also run Subversion, which is not yet compatible with Apache 2.2.

[More]

Release Day Is Upon Us

So, the flood has begun. Model Glue: ColdFusion (also known as 2.0, formerly Unity), as well as Model Glue: Flex, have been released, along with a brand new website.

[More]

The ColdFusion 8 AJAX Components Debate

A debate rages on across the ColdFusion development community about the inclusion, and use, of the AJAX driven components and accompanying tags that have been included in the Beta Release of ColdFusion 8. Many examples of their use and benefit have already been posted by the likes of Ray Camden, Ben Nadel, and Ben Forta. No surprise there, as they all are huge proponents of the product, and, like so many of us, are very excited about the upcoming release of our favorite web programming platform.

But there are still others who think that these tags and components don't necessarily belong in the core language set of CFML. Many of these folks are also diehard JavaScripters, who took up writing AJAX early in its infancy, fashioned their own components, or even contribute to open source libraries like jQuery. They argue that maybe the tags should have been separate CFCs available through the Adobe Developer's Exchange, or that the JavaScript rendered by the ColdFusion engine is too fat, taking up unnecessary bandwidth.

Can't we all just get along?

[More]
