AOLserver 4.5.0 released June 27, 2006

The long-awaited next version of AOLserver 4.5.0 has been released today! To get an idea of what’s changed, check out the release notes or for the geekier version, the ChangeLog. To download the latest source tarball, click here: aolserver-4.5.0-src.tar.gz.

Here are the various places the announcement was posted:

Tags:
,
,

Storing binary data in MySQL with AOLserver

This week at work, Michael Andrews asked how to round-trip binary data into MySQL from AOLserver. Since AOLserver’s ns_db interface is still defined using Tcl_CreateCommand which isn’t Tcl_Obj’ified, it’s sensitive to embedded NULL bytes. In other words, values (such as SQL strings) with embedded NULLs (which can appear in binary data) will result in the value being truncated at the first NULL. So, we have to escape them, and luckily MySQL understand this.

Here’s a brief walk-through at the control port demonstrating AOLserver’s ability to safely round-trip binary data with embedded NULLs to and from MySQL. I’ve left out a lot of prerequisites such as installing MySQL, configuring a database pool in AOLserver, configuring and connecting to the control port, etc. Those kind of things should be documented elsewhere, already.

First, we get a database handle:

nscp 1> set db [ns_db gethandle mysqldb]
nsdb0

For the sake of this demonstration, lets show what version of MySQL I’m testing against:

nscp 2> ns_db select $db "show variables like 'version'"
t0

nscp 3> ns_db getrow $db t0
1

nscp 4> ns_set array t0
Variable_name version Value 4.1.11-Debian_4sarge2-log

nscp 5> ns_set free t0

Okay, so I’m using MySQL 4.1.11 as shipped with Debian Linux, which is what I’m using. YMMV if you’re using a different version of MySQL, of course. Next, lets set up our test data. We’ll create a test value with all possible 8-bit bytes from 0 through 255, repeated once. The resulting value should be 512 bytes long:

nscp 6> set value ""

nscp 7> for {set i 0} {$i <= 255} {incr i} { append value [format "%c" $i] }

nscp 8> append value $value; return

nscp 9> string length $value
512

That little “; return” on line 9 is there to prevent the control port from returning the result of the “append value $value” which would have new returned the new value, or 512 bytes of binary data, back to our terminal, which we really don’t care to see. I’ll use that little trick throughout this tutorial where it makes sense to squelch a noisy command’s output.

At this point, we’ve got a 512-byte long binary value in $value, with two NULLs in it, one at the very beginning, and one 257 bytes into it. Now, in order to send this through to MySQL, we’ll need to escape several bytes: the NULL (also expressed as “\0“), the single-quote or “'“, the Control-Z (byte 26 or 0x1A in hex representation) and the backspace itself. We have to escape the single quote because it will prematurely end the value we’re trying to pass in the SQL statement. This is how SQL injection attacks work, by taking advantage of sloppy programmers who don’t escape these single quotes when using dynamic SQL. Database APIs that use “prepared statements” and “bind variables” generally aren’t as susceptible to these injection attacks (but they still can be, if done improperly), but that’s a topic for another discussion. In the meantime, Tom Kyte wrote a great article in Oracle Magazine January/February 2005 on this.

Next, we’ll define two helper procs that will escape and unescape values suitable for using in our dynamic SQL statements. I’ve named the procs mysql.quotevalue and mysql.unquotevalue. They both use Tcl’s string map to perform the transformation. To quote a NULL byte we replace it with the string “\0” and to quote a single-quote we prepend a backslash so it gets replaced with the string “\'“. We do the same operation in reverse to unquote the values:

nscp 10> proc mysql.quotevalue {value} { string map {\0 \\0 ' \\' \x1A \\Z \\ \\\\} $value }

nscp 11> proc mysql.unquotevalue {value} { string map {\\0 \0 \\' ' \\Z \x1A \\\\ \\} $value }

Now that we have these two procs defined, we can create a temporary table for our test and start INSERT‘ing the 512-byte long binary value into it. I’ve named the table binarytest with just one BLOB column named value:

nscp 12> ns_db dml $db "CREATE TABLE binarytest (value BLOB)"

nscp 13> ns_db dml $db "INSERT INTO binarytest (value) VALUES (_utf8'[mysql.quotevalue $value]')"

Because Tcl stores its values inside of Tcl_Obj’s as UTF-8 encoded data but we’re pushing it naively through the ns_db interface, MySQL will unfortunately be receiving UTF-8 encoded data from us. So, we need to tell it that, in case it needs to do any behind-the-scenes transcoding based on the server or database’s default charsets. In MySQL, if you prefix a value with underscore followed by a character set name, it makes note that the value is encoded in that character set, which is exactly what you see above.

Now, lets retrieve the data back out, using the MySQL function QUOTE() to ask the DB to quote the response before returning it to us, so embedded NULLs in the response don’t cause premature truncation:

nscp 18> ns_db select $db "SELECT QUOTE(value) FROM binarytest"

nscp 19> ns_db getrow $db t0
1

nscp 20> set result [ns_set get t0 "QUOTE(value)"]; return

nscp 21> ns_set free t0

Now, if you read the documentation for QUOTE(), you’d notice that MySQL will surround the value with single-quotes unless the value is a NULL database value. Since Tcl has no explicit way of expressing a NULL value, we’ll reduce it to an empty string — while I don’t necessarily think this is the right thing to do, this is also a topic for another time. So, lets go to work:

nscp 22> set newvalue ""

nscp 23> if {[regexp {^'(.*)'} $result -> newvalue]} { set newvalue [mysql.unquotevalue $newvalue] }; return

So, now we should have our original data in $value and our newly retrieved and unescaped value in $newvalue. Lets ask Tcl to compare the two and tell us if they are indeed equal or not:

nscp 24> string length $newvalue
512

nscp 24> string equal $value $newvalue
1

Yay! We got back exactly what we put in. Perfect!

Some caveats with this approach include the fact that we end up storing our binary data encoded as UTF-8 in the database, not as actual binary. If you plan to connect to this database with another application other than AOLserver through the ns_db API, be very aware of this. Other applications will need to transcode the data from UTF-8 back to plain binary, and will need to transcode to UTF-8 before inserting/updating the values in the database so that AOLserver can pull them out correctly. Also, due to the storage in UTF-8, this also means that the storage requirement in the database will be close to two times the actual binary data being stored, so be aware of this when doing your capacity planning.

I hope this little walk-through tutorial has been helpful to you, or at least has given you some new things to think about with respect to how to use AOLserver, Tcl and MySQL. If you have any questions, don’t hesitate to leave a comment below or email me privately.

The AOLserver Wiki is now MediaWiki-powered!

As much as I hate to say it, I’ve finally caved in: tonight, I switched the AOLserver Wiki from the Tcl and MetaKit based WiKit over to the latest MediaWiki 1.5.0 which is implemented in PHP and uses MySQL. It still runs under AOLserver (yes, you can run PHP applications under AOLserver), but it would have been nice to continue to run wiki software that’s written in Tcl — however, it’s hard to pass up on all the nice features built into MediaWiki, and implementing them all in WiKit is just too much work for me at this point.

To let everyone know about this change, I sent this message to the AOLserver mailing list. Lets see what folks think about the change!

tcl-coredumper mentioned on code.google.com!

Yay! Chris DiBona made tcl-coredumper the featured project on code.google.com today!

tcl-coredumper 0.1 released

I recently blogged about Google releasing some of their code as open source projects. What I didn’t explicitly say in that previous entry was how cool it would be if there were a Tcl interface to the Google coredumper library. Well, now there is!

Thanks to Nathan Folkman who had the same great idea, he started working on what is now being called tcl-coredumper. I assisted by doing the autoconf stuff, hacking on the Makefile, and writing the automated tests for it, as well as working on the code itself.

See the official announcement on the AOLSERVER-ANNOUNCE mailing list that was also crossposted to the AOLSERVER mailing list. It includes a link to download the source: tcl-coredumper-0.1-src.tar.gz.

Google Code: google-coredumper, google-sparsehash, google-goopy, google-perftools

Chris DiBona, previously an editor at Slashdot, now the Open Source Program Manager at Google, announced today that Google has launched its Google Code site, where it has placed some of its contributions back into the Open Source community at SourceForge!

The initial list consists of four projects:

  • google-coredumper: CoreDumper — “The coredumper library can be compiled into applications to create core dumps of the running program, without termination. It supports both single- and multi-threaded core dumps, even if the kernel doesn’t natively support for multi-threaded core files.”
  • google-sparsehash Sparse Hashtable — “This project contains several hash-map implementations in use at Google, similar in API to SGI’s hash_map class, but with different performance characteristics, including an implementation that optimizes for space and one that optimizes for speed.”
  • google-goopy: Goopy/Functional — “Goopy Functional is a python library that brings functional programming aspects to python.”
  • google-perftools: Perftools — “These tools are for use by developers so that they can create more robust applications. Especially of use to those developing multi-threaded applications in C++ with templates. Includes TCMalloc, heap-checker, heap-profiler and cpu-profiler.”

Three of the four are of interest to me: coredumper, sparsehash, and perftools.

For a long time, I’ve wanted better coredump capability in Linux, especially for multi-threaded applications such as AOLserver. Google’s contribution could “solve” that problem for me, which would be fantastic. Right now, it’s very difficult to troubleshoot a multi-threaded application on Linux because of this lack of capability, and gdb’s “gcore” just doesn’t cut it. Perhaps the Linux and GDB teams can integrate Google’s contribution back into their respective codebases; we’ll see.

Google’s sparse hashtable implementation could yield some performance improvement to Tcl which makes extensive use of hashtables. I’d like to see if I can use the Google sparsehash implementation as a drop-in replacement for the Tcl implementation and see what the benchmarks say. This could be big.

Google’s perftools is somewhat of a misnomer, since the big selling point is their improved memory allocator which is supposedly “[the] fastest malloc we’ve seen[, and] works particularly well with threads and STL.” This could displace the Tcl threaded memory allocator, if performance really is superior, or could be used by the Tcl threaded memory allocator for an additional performance boost. It should be fun experimenting and benchmarking it.

It’s nice to see Google publish some really valuable stuff back to the Open Source community instead of just lamely throwing us a bone like IBM. This is definitely consistent with Google’s “do no evil” philosophy.

Man, this is just awesome. It gives me a whole new range of toys to play with. It’s like Christmas in March!

Boston-area OpenACS users meeting, Friday March 11, 2005

Andrew Grumet blogs about today’s Boston-area OpenACS users meeting at the MIT Sloan School of Management. Wish I could have been there; maybe I’ll be able to attend the next one, if it’s at the same place next time.

another nsopenssl-related memory leak fixed!

Yesterday, I sat down and finally reviewed the patch that Christopher Bowman provided to fix the SSL-related leak and came up with a reproducible way of demonstrating the leak and verifying the fix (see SF Bug #1160850: nsopenssl leaks SSL obj. mem w/Keep-Alive). It turns out the leak was due to AOLserver not calling the socket driver’s DriverClose callback when a HTTP Keep-Alive connection timed out, which is where nsopenssl does its clean-up of the SSL structure that’s allocated for the connection. So, the fix was in aolserver/nsd/driver.c and not in the nsopenssl module itself. That means the fix will be generally available as part of the next release of AOLserver which will be 4.0.11.

Until 4.0.11 is released, however, if you have a server that is using nsopenssl and you’re seeing this memory leak, you can “work around” the problem by disabling HTTP Keep-Alive, which is unfortunately a server-wide config. option instead of a per-socket driver option. You do this by setting the keepalivetimeout parameter to 0 (zero) in the ns/parameters section, like this:

ns_section  "ns/parameters"
ns_param    keepalivetimeout    0

Again, thanks to everyone who has helped investigate, troubleshoot, provide information about and otherwise participated in the hunt for this leak.