I’ve been tweeting quite a bit about MongoDB over the past few weeks and its time to blog.

About 2 months ago we decided to install and mess around with MongoDB. We went from messing around to serious adoption about 2 weeks ago when we realized the power of working with it and PHP. It was Mitch Pirtle of www.spacemonkeylabs.com that first pointed us in this direction. Mitch was very enthusiastic about Mongo as he explained the great potential. Yet, it was only until I added an array of data from PHP into Mongo that my eyes started to open up.

Go ahead, open another tab in your browser (KEEPING THIS ONE OPEN) and Google MongoDB if you haven’t already (I made it easy providing the link). You’ll find sufficient documentation to give you the warm and fuzzy that 10gen isn’t messing around. They are in the database game for the long haul.

During our play with MongoDB we realized how true the term NoSQL is. For all you SQL grease monkeys out there it’s liberating. No Joins, No Select, just plain NoSQL.

As an example, let’s start off with putting uploaded files “directly” into MongoDB via GridFS (the storage specification for handling large files). Here is how simple it is:

First, we are going to assume you already have the code in place to handle the upload itself. We first tested this using SWFUpload which provides quite a bit of flexibility if you like to control how your page looks during upload. From a PHP perspective, the uploaded file will be accessible via the predefined variable $_FILES.

Here is what you have to do to get the upload into MongoDB:

<?php

$m = new Mongo(); //connect
$db = $m->selectDB("example"); //select Mongo Database

$grid = $db->getGridFS(); //use GridFS class for handling files

$name = $_FILES['Filedata']['name']; //Optional - capture the name of the uploaded file
$id = $grid->storeUpload('Filedata',$name) //load file into MongoDB

?>

Yep, that was it. One thing that took me a “second” to realize is that you actually pass the literal name of the file_post_name (in the example ‘Filedata’) to MongoDB. It does all the heavy lifting of getting the data from the system and storing it. Also, take note that you get an $id back which is the MongoID that acts as the “primary key”. That makes it easy for you to reference the file right away if you need to.

So that’s a rather quick look and tip for MongoDB. Stay tuned  as we continue to put out our MongoDB tricks and tips for PHP. We’re in it for the long haul too.

Where did January go?

Where did 2009 go?

Our last blog post was in December?

Wow, time is flying by. The good news is we’ve been quite busy here at LightCube. The bad news is we are breaking the rule of blogging regularly (sorry).

Looking back, the past two years has brought its challenges and lessons. But we are now looking forward and gearing up for what’s in store. Speaking of which, where is what we’ve been up to:

Tienda! – Who says the internet doesn’t need another eCommerce tool? If you check with the Joomla community you’ll quickly see there is a massive need. We’ve been working with Dioscouri to bring the first native eCommerce application to Joomla! The really exciting news is that Tienda v0.2.0 (Alpha) is available.

MongoDB – For a SQL Grease Monkey like myself this is a huge redirection. Don’t worry, we’re not abandoning our traditional LAMP stack just yet. It’s not so much about leaving MySQL as much as it is about being liberated to use something else. We’re already working on a few pet projects to use MongoDB for our core ACL and automatic MediaRSS feed generating.

WebEDI – What is EDI you ask? It’s a LONG story. But I can tell you this much,  it has been around before the internet and isn’t going away anytime soon. We are helping to enhance the EDI back office systems of companies that are warming up to web services and XML.

I did say a quick summary didn’t I? Although tomorrow isn’t promised to anyone this year is going to be a busy one. We’ll keep you posted (hopefully in increments shorter than a month).

Live from NYC!

It was a wonderful day at the Joomla! Developer Conference.  Keeping to my usual style I took notes in MindMap. I’m sharing as a PDF in case you would like to click some of the links or copy some of the text. Feel free to share any comments.

Joomla Dev ConferenceNYC

A big thanks to all who organized and presented. We at LCS really enjoyed it and look forward to the next few months as Joomla! 1.6 becomes solidified.

Knock, knock. Let me in!

Port-knocking has been around for a while, so many of you may already be familiar with the idea. But for those who aren’t, I’ll take a minute to briefly describe it. The concept is that a service (daemon in the Linux world) listens on the link layer of your network interface for “knocks”, or small packets hitting a combination of ports that you have predefined. The ports don’t have to be open since all of the magic here happens at a lower level. Once the combination of ports have been “knocked”, the service responds by doing what you configured it to do, run some command on your system.

One of the most common uses involves opening and closing port 22 in your firewall for remote shell access. As to why someone might want to do this, try opening up port 22 on a public machine and then watch your logs fill up over the next few days with script-kiddies attempting to brute force their way into your system. If you have one user with a common username and weak password, it’s only a matter of time before someone breaks in.

To avoid this, you can set up a port-knocking server, like the one at zeroflux.org. The page there gives you configuration examples for exactly the scenario I related above. It even offers a mechanism to somewhat ‘randomize’ the ports being knocked so that someone listening on the wire can’t knock the same ports with the same results.

How do you use port-knocking?

Because the knock daemon can be configured to run any command upon the right sequence of ports, the possibilities for its use are quite wide. My question to you is, apart from the above example (opening up port 22 for shell access) in what other scenarios would you find port-knocking useful? Do you currently use it for automating other tasks? Comments welcome.

One afternoon, not so long ago, I received a phone call from one of our clients asking LightCube to investigate why a web application hosted on an internal Linux server was so unresponsive. After a little bit of poking around it became apparent what was happening: someone had managed to break into the system and create a rogue account for themselves and was using this account to continually attack other machines! How had this intruder gained access? One word: VNC.

Before I explain further how this happened, let’s step back for a second. Our client is a fairly large company, with skilled IT professionals managing their network infrastructure and services, mostly hailing from the Windows world. When they set about developing an internal web application, however, the low cost of Linux and Open Source was too attractive to ignore. So they grabbed a distro, set it up on a machine and got to work. Coming from a Windows world, the technicians incorrectly (but perhaps understandably) expected an item labeled “Remote Administration” would configure a service that behave like Windows Remote Desktop Connection. Instead, what they configured was a very insecure VNC service on a publicly available machine.

(As a sidenote, to me this well illustrates a very important point. The known stability, reliability and low-to-nil licensing cost of Open Source software means that a lot of people are looking to use it, and these days, basic services can be implemented fairly easily. However, getting secure, reliable, optimized use out of your Open Source still requires someone who knows what they’re doing.)

Back to the story, here’s what happened: One of their administrators logged in remotely to the machine through the VNC connection. As root. (That’s the first mistake, but I won’t really address that too much here. Keep in mind they’re coming from Windows, eh?) Then, when the administrator was done doing what he was doing, he simply closed the VNC window. In the Windows world, that wouldn’t be much of a problem. When connecting again, the Windows server would require that you authenticate. With VNC, not so much. Unless you log out of the remote system, whoever next comes and tries a VNC connection on the default ’0′ session – they get whatever you left open. If you were logged in as root, as was the case here, a full root desktop is what you get. “Come right inside, make yourself at home! Here’s the keys, change anything you like.”

Don’t misunderstand me. This isn’t a case of “Windows has better security than Linux”. I think someone would have a hard time arguing that point. This is a case of someone enabling an insecure protocol on a Linux system without really investigating how it works. To be fair, this particular distro did make it seem like this was a pretty standard way of remotely administering the machine. A little note from the distro about VNC being unencrypted and using poor session handling methods would have been more helpful, though.

We closed up the security holes on their system and ran a full audit. Fortunately, the damage was minimal. Afterwards, we needed to find an alternative for remote desktop management. What we found was NoMachine NX. All the communication takes place over an encrypted SSH connection, so it is secure (well, as secure as your password or public key, but that’s another article). But it’s also fast. NoMachine has taken a different approach to data transmission, such that it outperforms VNC any day. The server currently only runs on Linux or Solaris, but they have clients for all major desktops. If you absolutely must have a GUI running on your remote Linux server, I highly recommend NoMachine NX as a better way to achieve it.

People use Eclipse. People use Windows. People use these tools together to develop code. I am not (typically) one of these people. My development environment is usually either Coda on the Mac or vim on the command line. But I do work with people using Eclipse on Windows, and while the code we’re building together is not platform specific, it does help if we all have the same capabilities. So when I added some functionality to our Apache ant deploy script to synchronize files on a remote server using rsync, the next step was, of course, getting it to work on Windows.

Allow me to walk through the configuration, step-by-step. The first thing was to create a target in my build.xml file which ant can use to synchronize files to the remote machine. Something like this:

<target name="rsync_remotehost" >
   <exec executable="rsync" dir="${cfg.someDir}">
        <arg line="-aOvz --chmod=g+w,Da+rX,Fa+r,F-X --exclude .svn . \
            ${rsync.user}@${rsync.server}:${rsync.dir}" />
    </exec>
</target>

So really, all that does is define a target which runs rsync inside a specified directory. The important aspects to note are the rsync parameters. For example this: -aOvz These are pretty standard options (a for archive, v for verbose, z does compression), except for the O. I wasn’t used to using this option, but it’s important for a very subtle reason. To explain, first a little background. Typically when I want to synchronize the entire contents of a directory with rsync, I would do this:

rsync -avz path/to/somedir user@remote.host:/some/remote/dir

But this doesn’t quite work in the Eclipse+Ant environment on Windows. The reason is because the variables holding the directory locations get represented in Windows path notation. (Imagine a C: at the beginning and all the directory slashes are backwards.) Rsync can’t handle this. It’s a Unix-based tool. It’s expecting a Unix-style path. So to get around this we set a dir attribute in the exec element which causes ant to change to that directory before execution. We also use a ‘.’ in the rsync argument line to specify that the source contents are this directory. This has an odd side effect: rsync attempts to set times on the corresponding directory at the remote location, which tends to fail. So we use the -O option to tell rsync not to set times on directories.

The other arguments given to rsync are for specifying more sane permissions on the remote files since Windows wanted to kill all permissions for group and world by default (--chmod=g+w,Da+rX,Fa+r,F-X) and for excluding our local Subversion files (--exclude .svn).

So that’s the ant target. Now to the interesting parts. ;-) First we have to get rsync installed on Windows. Cygwin makes this relatively easy. Download and install. Most options are pretty much good at the defaults, except watch out for this screen:

Selecting packages for cygwin

On this screen you need to add a couple of packages, both in the Net section: Net -> rsync and Net -> openssh. Just click once on each of those packages and it will add what you need. Finish up the installation by clicking on ‘Next’ and ‘Finish’.

Once Cygwin and Rsync is installed, there’s still a few more things we need to do in order to get it all to work together correctly. First, you need to add the path to Cygwin’s binaries to the Windows path so that system calls in Eclipse will find the Cygwin binaries. To do that, right-click on ‘My Computer’ and click ‘Properties’. Then click on the ‘Advanced’ tab and then the ‘Environment Variables’ button:

Environment Variables

Next find the section on the bottom called ‘System variables’ and scroll down and double-click on the ‘Path’ line:

Path variable

Insert the following at the end of the ‘Variable value’ line:

;%SystemDrive%\cygwin\bin

Lastly, you need to set up a public key so that when ant runs, it won’t hang waiting for you to input a password (which you can’t do inside Eclipse anyway). To do that, open up Cygwin – you should have a shortcut on your desktop. When it opens run:

ssh-keygen

Accept the defaults, don’t add a passphrase, just hit enter instead. When the command is done, it will have saved a public and private key for you. We want to upload those to our remote server, like so (you’ll be asked for your password, and possibly to accept the key for the remote host – type ‘yes’ to do so):

scp ~/.ssh/id_rsa.pub user@remote.host:~/

Next we need to place it somewhere special on the remote host (you’ll be asked for your remote password again):

ssh user@remote.host
install -dv ~/.ssh
chmod 0700 ~/.ssh
cat id_rsa.pub >> ~/.ssh/authorized_keys
rm id_rsa.pub
exit

Now, test the connection to make sure that you can connect without giving a password:

ssh user@remote.host

If all is good just exit and that should be it! The next time you open up Eclipse, it should be able to call rsync and run successfully through your Ant target!

We at LCS attended the first Joomla Day NYC event!

There were quite a few nifty things I picked up from the event that I have to investigate and incorporate into my bag of tricks. The first and major investigation point looking back is MongoDB which was brought up by Mitch Pirtle in “Expert- Extension Development” session. The other cool points you’ll have to pick up via the Mind Map that I’ve attached. Yes, its messy and “organic” but that’s what a Mind Map is all about.

Joomla Day NYC Mind Map

Since its inception, LightCube Solutions has run on a custom-built Linux machine. Being a former LFS developer, I hail from the Linux world of ‘Do It Yourself’, and so I prefer to use self-configured servers, tuned and set exactly the way I like. This is no Fedora or Ubuntu where a host of unnecessary packages are forced on you and custom configuration files mask the generic and standard configuration files that come with the original software. This is ultimate flexibility.

But that flexibility does come at a price. Maintaining an LFS system can become a chore. Installing a new package always means compiling from source. Staying on top of security updates is entirely left to you. The system is only as good as your personal understanding of its internals. A balance somewhere in between would be ideal:

  1. A lightweight system that is known to be stable and secure.
  2. The possibility of complete configuration is given to the end user.
  3. The focus of the system is tight, and therefore higher quality (in terms of stability, functionality and reliability) can be achieved.
  4. All the while the system benefits from security updates and testing derived from a community of users and developers.

And so, having realized that I needed to move beyond my personal build scripts and start packaging the system (at the least, for my own sanity) it was decided to create a distribution based on our own needs for Linux-based web services. Voila! LightCube OS is born. The basic outline of the distro’s goals are this:

  • Provide a lightweight, fast, stable and secure LAMP application server.
  • As close as possible, adhere to the GNU principles of free software in the packaging and distribution of the system.
  • As nearly as possible, provide a ‘vanilla’ system. In other words, don’t create obscure custom configuration schemes. Allow as much manual configuration by the end users as possible.
  • Focus on packaging software that is reasonably used with production LAMP servers. (E.g., there’s no reason to build an X desktop environment for a server housed remotely and accessed mainly through ssh. Make the system geared towards advanced command line users. As much as a good GUI is nice, there’s no reason for a remote server to run one locally.)
  • Make the base system streamlined, optimized and small. While it is realistic to package a few variations of software (E.g., nano vs. vim, Exim vs. Postfix), the core system should focus on one basic set of core packages.

These are the main ideas behind LightCube OS. The build scripts and the core package specs are already under development. And the distro’s project site/infrastructure has been put in place: http://www.lightcubeos.org. Volunteers are welcome to join in the development.

In the meantime, what are your thoughts concerning the above? What advantages/disadvantages do you see to such a distribution? Do you have any comments or suggestions that will help improve its appeal or usability? I welcome your comments…

If you are an Internet Explorer user, please, please, please, update your browser.

On a recent project, I spent considerable time trying to get the layout and controls of the site to work when viewed through Internet Explorer 6. Why did I feel this was important? Statistics show that a major portion of the web users are viewing online content using IE. Estimates run anywhere from 60% to 85% of users. Statistics vary, because no one group or organization can possibly gather data for all users, however, taking many popular sites and averaging their statistics can give you a pretty fair idea. Of the 60-85% of users that view the web with IE, a large portion still seem to be using IE version 6. This is a bad thing… a very bad thing. Why?

Simply put, IE 6 is not standards conforming. The W3C has set forth standards for the underlying code on web pages. This helps to ensure that content and data can be universally rendered and understood. Although IE6 does largely conform to the standards of HTML and CSS, very often it simply ignores rules and therefore nasty flaws in its rendering engine are revealed when viewing certain combinations of standards-compliant code. For examples, see here: http://www.positioniseverything.net/explorer.html

Tips and tricks exist to avoid most of the bugs, but seriously, should web developers be expected to side-step standards to support users that continue to use buggy software that is 8 years old (released in 2001) and is now two versions behind the latest? More importantly, money is regularly wasted in the time spent to ensure that a site or front-end to a web application can be properly viewed in IE6. If you are a web developer, how much time do you spend ensuring that IE 6 users can view your site? You might be surprised by the answer.

Internet Explorer is now up to version 8. Thankfully, this version finally takes the approach of adhering closely to the W3C’s standards. If you are an IE user, (and this post was not intended to spark that debate), please, make everyone’s life easier and update your browser.

I was looking around for concepts in building a reasonably secure HTML login form without using SSL, and I came across an interesting article (link at end of post). The concept it outlines is fairly simple, and I’m a little annoyed that I didn’t think of this myself earlier.

Essentially, the idea is that the password never actually leaves the client machine. Instead, the client sends a cryptographic hash of the password. For other security reasons, we also don’t want the server to store the password in plain text, so it should only store the hash value of the password.

Of course, this alone isn’t enough, because anyone scanning the wire could simply capture the hash and send that along to the server and authenticate. What we need is a way for the server and the client to agree that they have the same hash value for the password, without actually sending it. To accomplish this, we can set up the server to generate a random string and send that to the client. Then, both server and client append the password’s hash to the random string and perform a hash sum on the combined string. The client then sends that string to the server and if it agrees with the result the server got, we have a valid authentication.

The article referenced above included some sample code to illustrate this functionality, but I believe I can simplify it even further. It’s not a practical, real world example, because we’re not sending a user name or retrieving a password from a stored location on the server. But it should be enough to illustrate the concept and give a developer a head start in however they wish to implement. Personally, I plan to instantiate the code in a class and use XMLHttpRequest instead of traditional POST methods.

Anyway, on to the example code. Note: This example doesn’t actually look up any stored user login information. Instead it simply uses a pre-defined password: ‘password’.

We’ll need two files. The first file generates the server’s shared key and passes along the value to the client as well as the HTML and JavaScript needed to input a password, generate hash values and submit the form to the server.

Create main.php with the content:

<?php
// We'll use PHP's session handling to keep track of the server-generated key

session_start();

// Function to generate a random key.
// Modified from code found at: http://www.totallyphp.co.uk/code/create_a_random_password.htm

function randomString($length) {
    $chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    $str = NULL;
    $i = 0;
    while ($i < $length) {
        $num = rand(0, 61);
        $tmp = substr($chars, $num, 1);
        $str .= $tmp;
        $i++;
    }
    return $str;
}

// Call the function and set the shared key

$key = $_SESSION['key'] = randomString(20);
?>

<html>
  <head>
    <!-- JavaScript that contains the functions which perform the actual hashing -->
    <script type="text/javascript" src="http://pajhome.org.uk/crypt/md5/sha1.js"></script>

    <!-- The following function creates the hash of the concatenated key and password hash
and submits the content to the server via a form -->
    <script type="text/javascript">
	function login() {
		var p = hex_sha1(document.getElementById('pass').value);
		var k = document.getElementById('key').value;
		var h = hex_sha1(k+p);
		var hash = document.getElementById('hash');
		hash.value = h;
		var f = document.getElementById('finalform');
		f.submit();
	}
    </script>
  </head>
  <body>
    <form action="javascript:login()" method="post" >
	<input type="hidden" id="key" value="<?php echo $key; ?>" />
	<input type="password" id="pass" />
	<input type="submit" value="Submit" />
    </form>
    <form action="login.php" method="post" id="finalform">
	<input type="hidden" name="hash" id="hash" />
    </form>
  </body>
</html>

Next we need the file to handle the submitted values and compare the results. Create login.php with the following contents:

<?php
session_start();
$hash = $_POST['hash'];

$pass = sha1('password');
$key = $_SESSION['key'];

$server_hash = sha1($key.$pass);

if ($server_hash == $hash) {
	echo "MATCH!";
} else {
	echo "NO MATCH!";
}
?>

That’s pretty much it. If you want to see a little more fluid example in action, see: http://www.lightcubesolutions.com/~jhuntwork/secure_login/

Referenced article: PHP – Implementing Secure Login with PHP, JavaScript, and Sessions (without SSL)