Creating an Ajax mySQL Full Text Search in cfWheels

I’m quite a fan of mySQL full text search – don’t get me wrong, it’s not exactly Google, and it has some annoying limitations, but for a quick and easy search function, it’s not half bad. It’s particularly useful for intranets, or just listings in admin section when you might have to delve through a few thousand records otherwise.

This example uses cfWheels, and most importantly, the cfWheels plugin Remote Form Helper.

Step One: Set up an index on your Table in MySQL.

Note: you can only do this on myISAM tables, not innodb.
I usually do this is navicat, where it’s very painless to do indeed. (simply right click a table, select Design table, then select indexes: give the index a name, and select the columns in the table you want to search: make sure the index type is set to fulltext).

For the ‘navicat-less’, try something like this:

CREATE TABLE `articles` 
(
  `id` int(11) NOT NULL default '0',
  `title` varchar(125) default NULL,
  `topic` varchar(25) NOT NULL default '',
  `author` varchar(25) NOT NULL default '',
  `createdAt` datetime NOT NULL default '0000-00-00 00:00:00',
  `body` text NOT NULL,
);

Then add the index:

ALTER TABLE articles ADD FULLTEXT(title,body);

The attributes in the Fulltext() function are the column names we wish to search (I believe there’s a limit of 16 columns).

In your mySQL conf file, add this line under the [mysqld] section:
ft_min_word_len=3

This sets mySQL to search for words of 3 characters or more, rather than the annoying default of 4.

Step 2: cfWheels setup
Download and install/activate/configure the Remote Form Helper plugin.
Make sure you remember to add the line:

addFormat(extension="js", mimeType="text/javascript");

To your config/settings.cfm file as mentioned in the plugin documentation, and also include the wheel.jquery.js file distributed with the plugin. (oh and jQuery itself, obviously).

Step 3: Search Views & Controllers

We need to create at least three files.

Firstly, our controller, /controllers/Search.cfc:

<cfcomponent extends="controller">
<cffunction name="init">
<cfscript>
// This provides bit is essential!
 provides("html, json, js"); 
</cfscript>
</cffunction>

<cffunction name="q" hint="The Main Search Router">
<cfscript>
// If search terms is incoming, and is an ajax request
if(structkeyexists(params, 's') AND len(params.s) GTE 3 AND isAjax())
{   
// searchArticles() function returns our HTML directly to the data attribute 
renderWith(data=searchArticles(params.s), template="articles");
}
else
{ renderNothing() };
</cfscript> 
</cffunction>

<cffunction name="searchArticles" access="private" hint="Our FullText Search">
 <cfargument name="s" required="yes" type="string" hint="The Search Term">
 <cfquery name="q" datasource="#application.wheels.datasourcename#" maxrows="50">
 SELECT id, title, topic, author, body, createdAt FROM articles
 WHERE MATCH (title,body) AGAINST (<cfqueryparam cfsqltype="cf_sql_varchar" value="#arguments.s#">);
 </cfquery>
 <cfloop query="q">
  <cfset q.content[currentrow]=formatResult(string=content, highlightT=arguments.s)>
 </cfloop>
 <cfreturn q />
</cffunction>

<cffunction name="formatResult" access="private" returntype="string" hint="Returns a shortened teaser of the main content, with highlighted search terms">
 <cfargument name="string" type="string" hint="The String to truncate">
 <cfargument name="highlightT" type="string" hint="The term to highlight">
  <cfset var newString="">
  <cfset newString=highlight(text=truncate(stripTags(arguments.string), 400), phrases=arguments.highlightT)>
  <cfreturn newString />
</cffunction>

Next, some view files:

/views/search/index.cfm (where our search form and results output will live)

<cfparam name="params.s" default="">       
<cfoutput> 
<!---Form--->
<!--- NOTE THE remote=true attribute!--->
#startFormTag(controller="search", action="q", id="advancedSearchForm", remote="true")#
#textFieldTag(name="s", placeholder="Search..", label="Search For", value=params.s)#
#submitTag(value="Search")#
#endFormTag()#
<!---Output--->
<div id="results"></div> 
</cfoutput> 

/views/search/articles.js.cfm – This formats our returned dataset for return to the page

<!--- loop over and save the output in a variable--->
<cfsavecontent variable="resultSet">
<cfoutput>
<h2>Articles which match your search:  (#arguments.data.recordcount# results)</h2>
<cfloop query="arguments.data">
<cfoutput>
  <h3>#linkTo(text=title, controller="articles", action="view", key=id)#</h3>
  <h4>#Dateformat(createdAt, 'dd mmm yyyy')#</h4>
  <p>#content#</p>
</cfoutput>
</cfloop>
</cfsavecontent>

<!--- Use the plugin to shove back the results to the page --->
<cfoutput>
#pageReplaceHTML(selector="##results", content=resultSet)#
</cfoutput>

That’s it!
So the form at /views/search/index.cfm posts via ajax to the ‘q’ function in /controllers/Search.cfc.
That q() function returns the html via renderWith(), and the remote form helpers pageReplaceHTML() function posts the results to the results div on the calling page.

The nice part about this approach is that you can use that q() part with a switch/case statement to call any function to return data as you need. You could even go to a different search source, such as a Google Search Appliance, which I’ll cover soon.

Railo 3 Beginners Guide Review

Railo 3 Beginners Guide Cover

The Railo 3 Beginner’s Guide is a book released late last year which I’ve just managed to get my hands on thanks to Packt publishing. The core authors include some Railo team members - Mark Drew, Gert Franz and ardent Railo community members Paul Klinkenberg & Jordan Michaels – so there’s no better source here – straight from the horse’s mouth!

Whilst this one is a ‘beginner’s’ guide, it does cover a fairly wide range of topics for developers – they are unarguably ‘Railo-centric’ – but that’s sort of the point.

Chapter 1: Introducing Railo Server
Chapter 2: Installing Railo Server
Chapter 3: CFML Language
Chapter 4: Railo Server Administration
Chapter 5: Developing Applications with Railo Server
Chapter 6: Advanced CFML Functionality
Chapter 7: Multimedia and AJAX
Chapter 8: Resources and Mappings
Chapter 9: Extending Railo Server
Chapter 10: Creating a Video-sharing Application

Generally speaking, I find that books like this have a challenging scope: do you try and fit in everything you know for CFML development? Where do you start/stop? It’s a tricky balance – I found that there were some quite big leaps from absolute basics, like looping & outputting, through to storing information in the ram cache, using ORM and Amazon S3. Don’t get me wrong, these topics are important, but a fair leap away if you’re assuming the user can’t output a loop.

All that said, the examples given are clear and well explained, and certainly give a taste of a Railo outlook on CFML development. Hopefully there will be an advanced series too, which delves down into JVM tuning and garbage collection in details amongst other things.

As far a Railo resource, it’s undoubtedly useful – in fact I would say it’s ideal with a CF developer with ‘some’ experience, who wishes to use Railo-tastic features, and is moving from ACF. However, it’s not something I’d recommend to those who are picking up CF for the first time, as you just can’t fit the entire CFML programming fundamentals into a single book – in my opinion anyway!

Nested Layouts in CFWheels

Must confess I’ve been struggling with this one today, even with the great documentation at cfwheels.org.

Thanks to the cfwheels list, I’ve cracked it though (blogging the solution for posterity!).

I wanted to created a truly nested layout, i.e a master layout with <html> & <body> tags, and then a sub layout in my /views/main/ (with main being the controller name in this instance) which then included the default content on /views/main/index/. The problem was that when you add a layout file in /views/main/layout.cfm it overrides the default layout in /views/layout.cfm.

Turns out you need to call the parent template in the sub template and inject the content using contentFor() before you make the call.

So in my /views/layout.cfm

<html>
<body>
<div id="master-wrapper"> 
<cfoutput>#includeContent("mainbody")#</cfoutput>
</div>
</body>
</html>

and in my /views/main/layout.cfm

<cfoutput>  
<cfsavecontent variable="mainbody">
<div id="controller-layout">
#includeContent()#
</div>
</cfsavecontent>
<!--- Inject the above variable into the parent layout --->
<cfset contentFor(mainbody=mainbody)>
<!--- Include the parent layout --->
#includeLayout("/layout.cfm")#
</cfoutput>

and in my standard view of /views/main/index.cfm

<div id="main-view">
<p>I am the content in the main index file</p>
</div>

Results in…

<!-- Master Layout Start -->
<html>
<body>
<div id="master-wrapper"> 
  <!-- Sub Layout Start -->
  <div id="controller-layout">
    <!-- Standard View Start -->
    <div id="main-view">
    <p>I am the content in the main index file</p>
    </div>
    <!-- Standard View End -->
  </div>
  <!-- Sub Layout End -->
</div>
</body>
</html>
<!-- Master Layout End --> 

Notes: this means you’d need a layout.cfm in every views folder just to set the mainbody variable, otherwise your content never gets called by the top level layout. I’ve tried getting this working with the default body variable but haven’t had much success yet.

Changing a few things!

Sorry to confuse everyone, but I’ve been changing some things… please bear with me whilst I move over posts/comments etc from the old blog. I’ll be adding redirects soon.

I decided to move over WordPress from Mango Blog – that may make the more staunch CF developers in the world shudder, but Mango Blog just wasn’t keeping up with the times for me :)

Besides, the right tool for the right job eh?

cfWheels – Active Directory / LDAP authentication

Adding login and administrative features to your cfWheels apps has cropped up on the mailing list a few times, so I thought I’d just pull together a simple example for Active Directory / LDAP authentication.

In this particular snippet, I want to establish a user a) has credentials on the server, and b) belongs to a group called ‘ContactsDatabaseUsers’.

This method has the benefit of a) not having to store user’s passwords in your application database and b) allowing system administrators to control access to the application.

This example assumes you have a login form at /main/login/ which posts params.username and params.password to /main/dologin/

I also would have a local users model where I could add additional application specific options: i.e, when they last logged in, etc.

This information would then be copied to the session scope. ie. session.currentuser

When the users first login, this local entry needs to be created, or copied to the session scope from the existing entry.

<--------Main.cfc----------->
<cffunction name="init">
<cfscript>
// AD Auth, application wide: if I wanted to only restrict a subset a pages, I could use 'only' instead of except below.
// This filter call should be in every controller you wish to protect
filters(through="loginRequired", except="login,dologin");	
// Login
verifies(only="dologin", post=true, params="username,password", error="Both username and password are required", handler="login");	
</cfscript>
</cffunction>

<cffunction name="dologin" hint="Logs in a user via LDAP"> 
  <cfset var ldap=QueryNew("")/>
  <!--- The LDAP Server to authenticate against--->
  <cfset var server="ldap.server.domain.com">
  <!--- The start point in the LDAP tree --->
  <cfset var start="OU=People,DC=domain,DC=com">
  <!--- Scope --->
  <cfset var scope="subtree">
  <!---- Name of group --->
  <cfset var cn = "ContactsDatabaseUsers">
               <cftry>
                   <!--- Check the user's credentials against Windows Active Directory via LDAP--->
                   <cfldap
                       server="#server#" 
                       username="DOMAIN\#params.username#" 
                       password="#params.password#" 
                       action="query" 
                       name="ldap" 
                       start="#start#" 
                       scope="#scope#" 
                       attributes="*" 
                       filter="(&(objectclass=*)(sAMAccountName=#params.username#))"/>
                      	
                       <!--- Get the memberOf result and look for contactsDatabase---> 
                       <cfquery dbtype="query" name="q">
                       SELECT * FROM ldap WHERE name = <cfqueryparam cfsqltype="cf_sql_varchar" value="memberOf">
                       </cfquery>
                                             
                       <cfscript>
		if (q.value CONTAINS "CN=#cn#")
			{
			params.name=params.username;
			// Check for the local user profile
			if( model("user").exists(where="username = '#params.username#'"))
			{
				// User found, copy user object to session scope
				user=model("user").findOne(where="username = '#params.username#'"); 
				user.loggedinAt=now();
				user.save();
				session.currentuser=user;                              
				flashInsert(success="Welcome back!");
			}
			else
			{
				// No Local Entry, create user as LDAP Auth has been successful
				user=model("user").create(username=params.username, name=params.name, loggedinAt=now());
				if (NOT user.hasErrors()){
					user.save();         
					session.currentuser=user;							
					flashInsert(success="Welcome - as this is the first time you've logged in, your account has been created.");
				}
				else
				{
					flashInsert(error="Error creating local account from successful Active Directory authentication");
					StructClear(Session);
					renderPage(action="login");
				}
			}
			redirectTo(route="home");
		}
		else 
		{
		 redirectTo(route="home");
		 }						
      </cfscript> 

   <cfcatch type="any"> 
      <cfscript>
           if(FindNoCase("Authentication failed", CFCATCH.message)){
            // Likely to be user error, e.g. typo, wrong username or password
             flashInsert(error="Login Failed: please check your username and password combination");
             renderPage(action="login");
            }
            else {
            // Likely to be LDAP error, e.g. not working, wrong parameters/attributes
            flashInsert(error="Login Failed: Please try again.");
            renderPage(action="login");
            }
       </cfscript> 
   </cfcatch>
 </cftry>                       
</cffunction>
</cfcomponent>

<-----------Controller.cfc------------->
<cffunction name="loginRequired" hint="This function will prevent non-logged in users from accessing specific actions">
<cfif NOT isloggedin()> 
	<cfset redirectTo(controller="main", action="login")>
</cfif>
</cffunction> 
    
<cffunction name="isloggedin" hint="Checks for existence of session struct">
<cfif NOT StructKeyExists(session, "currentUser")> 
   <cfreturn false>
<cfelse>
   <cfreturn true>
</cfif>
</cffunction>

Don’t forget nested functions in CFWheels

Occasionally I forget clever little things, like using functions as argument values.

Here’s a small snippet which I ended up using when doing a blog front page, where I wanted to strip the tags out of the blog body, truncate it to 300 characters, and then add a ‘read more’ link at the end of the paragraph with an ellipsis.

There’s a load of nice little functions in CFWheels for this – stripTags(), truncate() and linkTo().

Here’s what the helper in the loop ended up as:

#truncate(
  text=stripTags(body), 
  length=300, 
  truncateString="#linkTo(
      text='... Read More &raquo;', 
      route='blog', 
      action='view', 
      key=id)#"
)#

Password Hashing and Salting

As I’m (hopefully) going to release my first open source project soon, I thought it would be a good time to revisit application security, specifically password hashing, salting and general encryption. If you’ve not come across Jason Dean’s blog, I’d recommend that as a first port of call, as most of what I’m referring to is explained there.

Password security is one of those never ending journeys; you have to assume that it’ll never be perfect, and that there’s always room for improvement. I’d recently had an idea to improve password hashing a little, which someone might find useful in principle.

Let’s take a simple password, like ‘myPassword1′ (incidentally, if you actually have any passwords like that, please go and get lastpass or password1 immediately and wash your keyboard out with soap).

‘myPassword1′ stored in a database in plain text is obviously a very, very bad idea. Most people’s gut reaction would be to hash it, which at least is one way.

<cfset thePassword='myPassword1'>
<cfset passwordHash=Hash(thePassword, 'SHA-512') />

But wait!, let’s assume 90% of our users are stupid, or at least, uninformed. If two people have the same password, their hashes will appear identically in the password column. On top of that, if I hash the password again, I’ll get exactly the same resultant string. So a hashed (unsalted) password is really not that useful. You could perform a ‘rainbow table’ attack: password hashes from your database can be compared against a table of known hashes (such the hash for myPassword1), et voila, password revealed.

So the next step is to introduce salting: that is, appending or prepending a unique string to the password to make the resultant hash unique, and make rainbow table attacks much much harder.

<cfset salt=createUUID() />
<cfset passwordHash=Hash(thePassword & salt, 'SHA-512') />

This has the advantage that attacks require the salt to even attempt some sort of dictionary/rainbow table attack. Usually the salt is stored alongside the hashed and salted password in the database.

This is the part which I think could be improved by a relatively small step. By storing the salt alongside the password in the users table, we’re essentially giving a hacker the necessary ammunition to attempt to compromise the password. Let’s make it a little harder. Let’s encrypt the salt itself using another key, stored outside the webroot. So I might have a folder called ‘auth’ on the same level as my ‘webroot’ or ‘html’ folder. In there, I’m going to store a text file with a UUID inside. When I want to compare a password hash, I now have to decrypt the salt before I can use it (as I acutally need the salt value, I’m not hashing it, as that’s one way) using the key read in via CFFILE outside the webroot.

<!---Get the key--->
<cfset authKeyLocation=expandpath('../auth/key.txt')>
<cffile action="read" file="#authKeyLocation#" variable="authkey">
<!--- New password hashing --->
<!--- Generate a salt, this is never stored in it's plain form---> 
<cfset theSalt=createUUID() />
<!--- Hash the password with the salt in it's plain form---> 
<cfset passwordHash=Hash(thePassword & theSalt, 'SHA-512') /> 
<!--- The encrypted salt to store in the database, 
using the authKey--->
<cfset salt=encrypt(theSalt, authKey, 'CFMX_COMPAT')>

So why do this? If your database is compromised, you at least have an additional key to the puzzle missing for the attacker: they would have to do both a compromise of the filesystem AND the database to get anywhere. Additionally, you could quickly invalidate all the passwords in your application by deleting/regenerating the ‘master key’ at a file system level (this also assumes you have a half decent ‘forgotten/reset password’ feature in your application), which could be useful is you suspect the database and file system to be compromised at any point.

Whilst this is hardly comprehensive explanation, by adding additional layers and parts to the process, we’ll hopefully make any attackers lives a little more difficult. Jason Dean adds another step I like looping the hashing process over 1000 times – if you made this number completely arbitary between 1000 and 1500, you’d have another variable the attacker would have to get, and this one would be stored in the source code of the application itself.

So assuming you added all the above measures, you’d have to:

  1. compromise the database (for the password hash, and the encrypted salt),
  2. the source code for the application (to get the number of hashing loops), and
  3. get the facility to navigate outside the (potentially) locked webroot (to get the master key to decrypt the encrypted salt).

Mucking around with numbers

So, I was reading up on the Fibonacci sequence the other day, and wondered how long it would take CF to calculate the sequence to a couple of thousand steps, and also how long those numbers ended up becoming.

Consider the following loop:

<cfset n1 =0>
<cfset n2 =1>
<table>
<thead>
<tr><th>Row</th><th>No.</th><th>Length</th>
</thead>
<tbody>
<cfloop from="1" to="1000" index="i">
<cfset x = (n1 + n2)>
<cfoutput>
<tr><td>#i#</td><td>#len(numberformat(x))#</td><td>#numberformat(x)#</td></tr>
</cfoutput>
<cfset n1 = n2>
<cfset n2 = x>
</cfloop>
</tbody>
</table>

I immediately hit a few issues. One was scientific notation, which was solved by numberformat(), but the other issue I found was that if I tried to increase the number of loop iterations beyond 1475, I’d get an error simply stating ‘For input string: “I”‘

Removing the numberformat mask revealed 1.30698922376E+308 as the largest generated number in the sequence, with the remaining rows represented as ’1.#INF’.

So, out of curiosity, does anyone know why? Is it simply that Java can’t handle larger integers?