Today In nodeJS

The Short Comings of Sphinx GEO Search

When I first got Sphinx installed I was thrilled. I had found a simple easy solution to all of my problems. I started churning out indexes left and right. One day my boss asked me “Hey can we put some geo distance information on here?” … “No problem!”. I had been reading the docs and saw how easy it was to add in geo distance and to sort by geo distance and to limit within a geo distance. Heck it was even easy to limit with in a range of distance(x with in 40 to 50 miles). Needless to say I added the columns added the 3 lines of code and poof there it was. What used to take stupid mathematical queries is poof done in 3 lines of code. At this point I was pounding the Sphinx Kool Aid, and boy did it tasted great.

A couple days later my boss pointed out to me, that we didn’t have a 1:1 relationship between our products and their locations, we had products distributed throughout the United States, and that my solution while correct(we had a featured location) only worked on a 1:1 correlation and so could i please implement a 1:N relationship. Now completely drunk on the Kool Aid my response “OF COURSE!”.
I proudly opened up my fully customized conf file and set about to change the latitude and longitude columns to be Multi-valued attributes . But before I could press the first key I realized that Latitude and Longitude need to be compared as ordered pairs, if I made them mva’s they’d be compared as n:n of latitude to longitude comparisons(I didn’t try this I know a bad idea when it see it, sometimes).

I then began to do some thinking, and then some reading, and then some more thinking and more reading. I came to the realization that addresses must be stored in their own index, not a problem I quickly create my new index. Knowing that I could run multiple queries on different indexes at the same time, I wrote up a quick sphinx query and executed. My jaw nearly hit the floor, there was a storm trooper telling me “These aren’t the results I was looking for”. I did some quick reading and find out that it is impossible in a multi query to pass results from the first query to the second query with in sphinx. Down hearted but still convinced this is possible with Sphinx, I decided I can hack it.

And Hack It I Did. I would run two queries, one would limit our products by location, the second would limit actual products by product specific constraints and I would pass in the product ids using php. This looked pretty damn ugly, slightly better than sql solution, but it was working. I had to tweak the memory usage up on the search daemon to accommodate for the larger data set, but all in all it seemed to work fast the code was straight forward and everything was in one area Sphinx. I began to fill up my cup with kool aid. A week had passed, now was the big moment the push live. We pushed live and instantaneously I saw memory issue errors over flow my error log. I said not a problem, the live server has more ram than dev, I’ll throw some machine at it. Then I saw the PHP memory errors coming. And that’s when the KoolAid went bad. As it was late, the push live had been backed up for over a month now, I patched code to stop the error messages at the expense of quality data and I started looking in svn for the old trusted distance formulas.

In this specific instance I end up using straight SQL, there were a couple joins, but performance was not horrible with our traffic load. In other instances where I needed the full text search, I did the same distance constraints and passing the specific role ids into sphinx. But there was always that pain in my neck, the jabbing in my side, knowing that this code should be cleaner.

Ruby On Rails Compared to PHP

Note: This is not in any way a code or performance comparison.

First and Foremost lets start out with this is a horrible comparison. Ruby on Rails
is a Web Framework, PHP is a programming language. If you wanted a comparison between
the two PHP would win, it is a fully developed language. The comparison people
should be making is Ruby vs Akelosa, ash.MVC, CakePHP, CodeIgniter, DIY, Ez Components,
FuseBox, PHP On Trax(Ruby on Rail Clone), PHPDevShell, PhpOpenbiz, Prado, QPHP, Seagul,
Symfony, WACT, WASP, Yii, Zend, and Zoop.

As we can see from this short list there are a bunch of PHP Web Frameworks and chances are some of these come close to Ruby On Rails, and some may be on par. They all lack one thing full community support. You have your CakePHP programmers, if they’re still left, Symfony programmers and Zend programmers. Ruby has only one web framework, or only one web framework that matters. This gives Ruby On Rails a far greater community than any PHP Framework. Every web developer who uses Ruby commits to Rails, they all submit patches and addons to Rails exclusively. The PHP side is divided, most people submitting patches to their own specific framework they love, or better yet writing their own(I have). If all of these developers committed to one Framework I am sure that there would be no comparison between Ruby on Rails and Chose PHP Framework.

Why would their be no comparison? Rails enthusiasts claim supremacy due to all in one framework, PHP On Trax has basically shown that you can copy the Rails suite to PHP. The next big RoR is Active Record, which has basically been implemented in PHP by Doctrine. But more interestingly Twitter Dev Alex Payne talks about scaling issues surrounding active record(older version of rails). PHP has implemented all of the “benefits” of Rails but through different projects. If they were all under one umbrella it would surpass Rails as the Complete Web Developer Framework.

JQuery UI Tabs Bug In IE 7 /8

Depending on the version of JQuery you’re running, included plugins/scripts, and how you retrieve JQuery, you may have noticed that JQuery’s UI Tabs is broken. JQuery should be aware of the problem as it has been posted to their forum.

This line of code has been provided by one of the users as a reason for the error appear in IE 7/8. The thread appears to resolve mainly around MicrosoftAjax.js issues. This same issue happens if you use Google Ajax Apis. If you look at the javascript code executed by the jsapi from google you will see they’re storing their loader in the window object as well.

This appears to be an issue involving the window object in JQuery UI Tabs. This issue only affects version 1.4.2 of JQuery, if you change to 1.4.1 you’ll be fine.

You may consider not using Google Ajax Apis, in that case replace your google.load code with the following:

Object Caching PHP

JLogs is the name of the MVC Framework that powers HammeredForums. Its power is its simplicity. It uses straight SQL instead of the automatic joins of an ORM(Object Relational Mapper). One of the flaws within the JLogs architecture is its object loading. In a message board thread one user can post a number of times. This means the author’s “user” object will be loaded one time for each post.
The solution to this is object caching. The basic principle is store a hash of all objects that JLogs has loaded. If the object exists in the objectLibrary a pointer to that object is returned. If the object does not exist, it is added to the library and the pointer to it is returned.
The requirements to create this system are basic: a way to create a unique identifier for each object and and a place to store each object where it can be retrieved rapidly. The second part is simple, dictionary, hash table, Jlogs uses a relational array. Creating the key is tricky. JLog uses a combination of keys for each object allowing more potential look up values. JLog parses the database schema recording column types, primary keys, and unique keys of each table, serializing the data and storing it in a file for a given life cycle. The Jlogs basic lookup consists of find($conditions, $limit, $order). $conditions are a key value pair of database columns and values. JLogs parses the $conditions trying to create a primary key or unique key. If their are conditions enough to create a primary or unique key they are hashed and looked up in the objectLibrary. The hashing algorithm is rather simple, but robust enough to support composite primary keys. Given a table with the primary key `id` and the value of 124, a relational array is created array(‘id’=>124). This array is then serialized to create the hash “a:1:{s:2:”id”;i:124;}”. This is the lookup hash for the objectLibrary.

If an object is not found in the library the query goes off as normal. The returned object is then parsed for all of its primary keys and unique keys. These are all serialized to separate look up hashes. Given the same table before with the primary key id and value of 124, pretend it has a unique key email with the value of dev@jlogs.com. Jlogs will create two hash keys, “a:1:{s:2:”id”;i:124;}” and “a:1:{s:5:”email”;s:12:”dev@logs.com”;}”. Both of these keys will be added to the objectLibrary. This gives us an extra “hit” in our objectLibrary, the primary key as well as the unique key `email`.

Here is a dump from my general query log on some development data, this is nothing amazing but just a view of what I am talking about.

As you can see here the same exact query ran 3 more times than necessary with out object caching.
You can see the important functions in JLogs Object Caching code, or feel free to the file in entirety.

<?php
   // file: model/user.php
   public static function find_first($data, $order = '') {
	return parent::find_first($data, $order);
   }

// File: model/base.php (parent of user)
public static function find_first($data, $order) {
	$x = self::find($data, $order, 1);
	return is_array($x) ? array_shift($x) : $x;
}
    
public static function find($conditions = array(), $order = '', $limit) {
	$order = ($order) ? ' ORDER BY ' . $order : '';
	$limit = ($limit) ? ' LIMIT ' . $limit : '';
		
	// General Good To haves
	$class = self::get_base_class();
	if (!self::$schema[$class]) {
		// Retrieve Table Info from cache.
		self::$schema = self::retrieve_schema();
	}

	switch (self::find_by_unique($conditions, self::$schema[$class])) {
		case self::PRIMARY_KEY:
			if ($object = self::fromPrimaryKey($conditions, self::$schema[$class], $class)) {
				return $object;
			}
			break;
		/* not yet implemented
		case self::UNIQUE_KEY:
			if ($object = self::fromUniqueKey($conditions, self::$schema[$class], $class)) {
				return $object;
			}
		*/
	}
	
	$table = self::$schema[$class];
	$query = 'SELECT * FROM `'.$class.'` %s' . $order . $limit;
	
	if (is_numeric($conditions) || filter_var($conditions, FILTER_VALIDATE_INT) !== false) {			
		// handle where
		$constraints = $variables = array();
		// Create Bindings
		foreach($table[self::PRIMARY_KEY] as $idx=>$key) {
			$constraints[] 	= '`'.$key.'` = :pk_'.$idx;
			$variables[] 	= array(':pk_'.$idx, $conditions, $table[$key]['bind']);
		}

		// Create Query
		$query = sprintf($query, 'WHERE ' . JOIN(' AND ', $constraints));
		$dbh = dbConn::get_connection();
		$stmt = $dbh->prepare($query);
	    
		// Do Bindings(they have to goto a stmt :( )
		foreach($variables as $variable) {
			list($parameter, $value, $bind_type) = $variable;
			$stmt->bindValue($parameter, $value, $bind_type);
		}
		
	} elseif (is_array($conditions) && count($conditions)) {
		$constraints = $variables = array();
		foreach ($conditions as $cond => $value) {
			$constraints[] 	= '`'.$cond.'` = :cons_'.$cond;
			$variables[]	= array(':cons_'.$cond, $value, $table[$cond]['bind']);
		}
		
		// Create Query
		$query = sprintf($query, 'WHERE ' . JOIN(' AND ', $constraints));
		$dbh = dbConn::get_connection();
		$stmt = $dbh->prepare($query);
	    
		// Do Bindings(they have to goto a stmt :( )
		foreach($variables as $variable) {
			list($parameter, $value, $bind_type) = $variable;
			$stmt->bindValue($parameter, $value, $bind_type);
		}
	}
	
	$stmt->execute();
	$results = array();
	foreach($stmt->fetchAll(PDO::FETCH_CLASS, $class) as $object) {
		objectLibrary::checkIn($class, $object->get_keys(), $object);
		$object->set_originals();
		$results[] = $object;
	}
	return $results;
}

/**
 * Returns an array of all Unique and Primary Keys
 * for any object
 *
 * @return array
 * @author Logan
 */
private function get_keys() {
	$table 	= self::$schema[get_class($this)];
	$keys 	= array();
	
	foreach (array($table[self::UNIQUE_KEY], $table[self::PRIMARY_KEY]) as $keysList) {
		$to_hash = array();
		foreach($keysList as $column) {
			$to_hash[$column] = $this->$column;
		}
		$keys[] = self::hash_key($to_hash);
	}
	return $keys;		
}

public static function find_by_unique($conditions, $table) {
	// Makes assumption that primary key will test positive if you had 0 as primary key too bad
	if (!$conditions) {
		jlog::qd("No conditions passed in");
	}
	
	// Makes assumption we used ints as primary keys...are we idiots?
	if (is_numeric($conditions) || filter_var($conditions, FILTER_VALIDATE_INT) !== false) {
		return self::PRIMARY_KEY;
	}
	
	
	// Does query contain full primary key
	$primary_key = is_array($table[self::PRIMARY_KEY]);
	if ($primary_key) {
		foreach($table[self::PRIMARY_KEY] as $row) {
			if (!isset($conditions[$row])) {
				$primary_key = false;
				break;
			}
		}
	} 
	
	
	if ($primary_key) {
		return self::PRIMARY_KEY;
	}
	
	// Does table contain unique key
	$unique_key = is_array($table[self::UNIQUE_KEY]);
	if ($unique_key) {
		foreach($table[self::UNIQUE_KEY] as $row) {
			if (!isset($conditions[$row])) {
				$unique_key = false;
				break;
			}
		}	
	}
	
	if ($unique_key) {
		return self::UNIQUE_KEY;
	}
	
	
	return self::NO_KEY;
}private static function retrieve_schema() {
	$cached = false;
	if ( self::schema_cache_length && file_exists(self::schema_file) && self::schema_cache_length > (time() - filemtime(self::schema_file))) {
		$serialized_schema = file_get_contents(self::schema_file);
	    $schema = unserialize($serialized_schema);
		
	} else {
		$schema = self::create_schema_cache();
	}

	return $schema;
}
    
private static function create_schema_cache() {
	$dbh = dbConn::get_connection();
	$schema = array();
	foreach($dbh->query('SHOW TABLES') as $table) {
		$table = array_shift($table);
		$table_data = array('table_name'=>$table);
		foreach($dbh->query('DESC ' . $table) as $row) {
			if (strpos($row['Type'], '(')) {
				$temp      = explode('(', $row['Type']);
				$type      = strtoupper(array_shift($temp));
				$length    = intval(array_shift($temp));
				
				if ($type == 'TINYINT' && $length == 1) {
					// How do I know this a bool?
					$type = 'BOOL';
				}
				
			} else {
				$type = strtoupper($row['Type']);
				switch($type) {
					case 'DATETIME':
						$length	= 'Y-m-d H:i:s';
					break;

					// Other formatting stuff could go here

					default:
	      				$length    = 0;
				}
			}

			if ($row['Key'] == 'PRI') {
				$table_data[self::PRIMARY_KEY][] = $row['Field'];
			}

			#FIXME doesn't support multiple composite unique keys
			# we dont' have one not a big issue
			if ($row['Key'] == 'UNI') {
				$table_data[self::UNIQUE_KEY][] = $row['Field'];
			}

			$table_data[$row['Field']] = array(
				'type'  => self::$type_hash[$type],
				'bind'	=> self::$bind_hash[$type]
			);  

			$table_data[$row['Field']][is_numeric($length) ? 'length' : 'format'] = $length;
		}
		
		$schema[$table_data['table_name']] = $table_data;
	}

	$file_pointer = fopen(self::schema_file.'.tmp', 'w+');
	fwrite($file_pointer, serialize($schema));
	fclose($file_pointer);

	rename(self::schema_file.'.tmp', self::schema_file);
	return $schema;
}

protected static function fromPrimaryKey($conditions, $table) {
	$class = $table['table_name'];
	$num_conditions = $num_pks = 0;
	
	if (!is_array($conditions)) {
		$column = reset($table[self::PRIMARY_KEY]);
		$to_hash = array($column=>$conditions);			
	} elseif ( ($num_conditions = count($conditions)) && ($num_pks = count($table[self::PRIMARY_KEY])) && $num_pks >= $num_conditions) {
		$to_hash = array();
		foreach($table[self::PRIMARY_KEY] as $column) {
			$to_hash[$column] = $conditions[$column];
		}
	}
	
	$hash = self::hash_key($to_hash);		
	$object = objectLibrary::checkOut($class, $hash);

	// Check remaining conditions
	if ($num_conditions > $num_pks) {
		foreach($conditions as $column => $value) {
			if ($object->$column == $value) {
				return false;
			}
		}
	}
	return $object;
}

public static function hash_key($key) {
	return serialize($key);
}

Activating General Query Log MAMP

To activate the general query log for MAMP open up terminal
/Applications/MAMP/bin/startMysql.sh

On the end of the second line add the follow code:
“–general_log=1 –general-log-file=/Applications/MAMP/logs/mysql_general_log”

The first part turns on the general log the second part sets the path to where the queries will be written.

You can view the queries as they run with the following command:
tail – f /Applications/MAMP/logs/mysql_general_log

Open Lightbox From link

When using lightbox as a photo gallery you may want to have the images only appear in your lightbox, not anywhere else on the page. The trick to this is that lightbox is active by clicking an image. This is how to manipulate this configuration to open a lightbox displaying many images on via a text link. This will decrease page load speed because the images will not be retrieved until lightbox requires it. This is based off the JQuery Lightbox Plugin.

The Html:

<a href="#" class="activateLightbox">See Gallery</a>
<div class="gallery" style="display:none;">
    <a href="image1.jpg" class="lightbox" rel="galleryView">&nbsp;</a>
    <a href="image2.jpg" class="lightbox" rel="galleryView">&nbsp;</a>
    <a href="image3.jpg" class="lightbox" rel="galleryView">&nbsp;</a>
    <a href="image4.jpg" class="lightbox" rel="galleryView">&nbsp;</a>
    <a href="image5.jpg" class="lightbox" rel="galleryView">&nbsp;</a>
</div>

None of the links will be displayed because they’re wrapped in a hidden div, and the rel tag identifies the gallery.

The javascript:

$(document).ready(){function(){
    $('a.lightbox').lightbox();
    
    $('a.activateLightbox').click(function(e){
        e.preventDefault();
        $(this).next('.gallery a.lightbox:first').trigger('click');
    });
}};

This is pretty straight forward, the first line binds the Lightbox plugin to the links with class lightbox. The next binds a custom function to the activateLightbox link. It overrides the normal click event. It then finds the next node that matches our criteria for lightbox. You should customize this selector to match the needs of your site. We then trigger a click on the node. Lightbox has been bound to this node already, so lightbox is triggered.

Display Google Map In A Pop Up

If you have ever tried displaying a google map in a pop div, you’ve probably realized that there is something wrong. Google determines the div size when the map is initialized. Since the div is generally hidden at this time it Google cannot accurately determine the size. The solution to this is rather simple, make sure the div is not hidden when map is initialized or resize the map when the div is displayed. We’ll go with the second solution as it is far more elegant. Chances are that your user may open or close the pop up div multiple times. There is no need then to be pingping the maps API multiple times, so we’ll ping once and resize.

$(document).ready(){function(){
    function initialize() {
        var latlng = new google.maps.LatLng(latitude, longitude);
        var map = new google.maps.Map(document.getElementById('map'), {
            zoom : 8,
            center: latlng,
            mapTypeId : google.maps.MapTypeId.ROADMAP
        });
    }
    
    
    initialize();
    
    $('#mapOpen').click(function(){
        $('#mapWrapper').fadeIn( 'slow',  function(){
            google.maps.event.trigger(map, 'resize');
        });
    });
}};

The wrapper div is displayed by the fadeIn function. Upon completion a google maps function call is made that checks for a new width of the div. #map should always be displayed, the wrapper #mapWrapper should be have display set to none.

Include Google Maps 3 via Google Loader

Changing the version from 2 to 3 is not enough to get Google Loader to work correctly. You’ll get some error about how the sensor parameter is not set. To set this parameter use the follow code:

<script type="text/javascript" src="http://www.google.com/jsapi?key=YOUR_API_KEY_HERE"></script>
<script type="text/javascript">
google.load("maps", "3", {other_params:"sensor=false"});
</script>

Sphinx Configuration Inheritance

Sphinx Configuration files support inheritance between data of the same type eg source, index.
This can be a huge time saver when rolling code live or when supporting multiple indexes.
Take this configuration for example, as you can see the database information is stored multiple times.

source product {
    type                    = mysql
    sql_user                = logan
    sql_host                = localhost
    sql_pass                = password
    sql_db                  = development
    sql_port                = 3306
    
    sql_query               = \
                              SELECT name, retail_location, retail_name, quantity \
                              FROM product_sphinx
    sql_ranged_throttle	    = 0
    sql_query_info		    = SELECT * FROM product_sphinx WHERE id=$id    
    sql_attr_float          = latitude
    sql_attr_float          = longitude
}

source retail {
    type                    = mysql
    sql_user                = logan
    sql_host                = localhost
    sql_pass                = password
    sql_db                  = development
    sql_port                = 3306
    
    sql_query               = \
                              SELECT id, name \
                              FROM retail_sphinx
    sql_ranged_throttle	    = 0
    sql_query_info		    = SELECT * FROM retail_sphinx WHERE id=$id    
    sql_attr_multi          = uint products from query; \
                                   select retail.id, product.id,  FROM retail_outlet retail \
                                    JOIN retail_outlet_product product ON \
                                    product.retail_outlet.id = retail.id 
}

Inheritance will ease updates and make your configuration file readable.

The two newly created sources site_live and site_dev allow for easy changing of which database credentials. “:” is the inheritance operator, so retail inherits from site_dev. To change sphinx from reading the live configuration file simple replace “retail : site_dev” to “retail : site_live”. Also any database credential changes are now made in one place.