Polyglot programming

As you may have gathered from previous entries, I've recently become interested in programming languages again.
I'm almost done (though skimming some parts) with my copy of the aforementioned Ruby book, and though I haven't done any substantial Ruby programming (just playing around in jirb while reading), I think I have a good idea now why so many people love the language.


Closures


Most of my real code over the last number of years has been in C, Java, and Python, and I know those languages and their runtime libraries pretty well, but reading the Ruby book I was struck by how really useful closures can be in an Algol-family language (i.e. not Lisp). Well, Ruby calls them "blocks" and has infrastructure on top in the form of yield, etc., but that's fundamentally what they are. C/Java/Python all lack them (no, Python's single-line lambdas are too restrictive to count).



Closures are incredibly powerful; in fact, you might say they're the ultimate language construct. Neal Gafter has a good description of the kinds of things you can do with them, starting from a language that doesn't currently have them.
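For anyone who hasn't run into the idea, here is a rough sketch of what "capturing the environment" means. It's in Python 3, using nested functions and `nonlocal`, which is clunkier than a Ruby block but shows the same mechanism. The names `make_accumulator` and `each_with_index` are made up for this illustration.

```python
def make_accumulator(start):
    """Return a function that remembers and updates a running total."""
    total = start

    def add(amount):
        # 'total' lives in the enclosing call's environment, not in 'add'.
        nonlocal total
        total += amount
        return total

    return add


def each_with_index(items, block):
    """Call 'block' once per item, roughly like a Ruby method that yields."""
    for index, item in enumerate(items):
        block(index, item)


acc = make_accumulator(10)
acc(5)
acc(7)  # the captured total is now 22

seen = []

def remember(index, item):
    # The "block" body: it appends to a list defined outside itself.
    seen.append((index, item))

each_with_index(["a", "b"], remember)
```

The inner function keeps working after `make_accumulator` has returned, which is the whole point: the function carries its environment along with it.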


Polyglot programming


While I was thinking about this blog entry, I reread one of Steve Yegge's great blog posts, and decided to look up more about his reference to the author of a design pattern book "leaving Java to go to Ruby". After going to Martin's home page, I found on his wiki he has a good entry which is pretty close to what I wanted to talk about. That is basically: it makes sense for large software systems to have multiple layers in different languages.


Now, if you're thinking "Wow, that's obvious", that's good; but there is more to the story here. So let's look at some rationales. If your app is large enough, you probably have parts which need to be fast. And you probably have other parts which cry out for a domain-specific language.


So for speed, you'll want a lower layer which is usually characterized by manifest typing and direct vtable function dispatch. Read: C++/Java/C#.


But it often makes sense to have a higher layer which is agile. It is better for the parts of your program which change rapidly - this could be user interface bits you're prototyping, or test cases you're creating rapidly. This layer is usually characterized by implicit typing (possibly with type inference), metaprogramming capabilities, and (ideally) good integration with the lower level language. Read: Groovy, JavaScript, Python, Ruby.


There is a lot of software out there split in exactly this way; in fact, you're almost certainly reading this blog entry in one of them, Firefox, where the answer is C++ and JavaScript. A lot of computer games are built in this way too - for Civilization 4, the answer is C++ and Python, and for World of Warcraft it's C++ and Lua. If you're familiar with Java, just think about JSP and Ant - they're really DSLs. If you mostly know Python or Ruby, think about how much of the underlying platform is actually written in C/Java/.NET.


So it's fairly easy to dismiss anyone who says something like "everything must be written in language X", for values of X like C, Ruby, Python, Java. Which reminds me to say: Eclipse really needs to embrace Eclipse Monkey.


The impedance mismatch




So we accept that it makes sense to have multiple languages with different characteristics. One important issue then becomes - how similar are our two different layers? Take the example of C++ and Python, as in Civilization 4: the gap is enormous. C++ containers are not the same as Python containers. C++ strings are not the same as Python strings. C++ objects and Python objects are wildly different. The answer to this problem is to create a special glue layer; in Mozilla, it's called XPCOM. In GNOME, it's called pygobject. These layers are very painful to create and maintain.
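To give a feel for the mismatch, here is a tiny sketch using Python's standard ctypes module to stand in for the C side. This is not how XPCOM or pygobject work internally; it only shows that the same data has to be copied and converted each time it crosses the boundary. The helper name `to_c_array` is made up for this example.

```python
import ctypes

# A C-style string: a fixed-size, NUL-terminated buffer of bytes.
c_buf = ctypes.create_string_buffer(b"hello", 16)

# Glue: copy the bytes out and decode them to get a Python string.
py_str = c_buf.value.decode("utf-8")

# A C-style container: a fixed-length array of machine ints.
IntArray3 = ctypes.c_int * 3
c_arr = IntArray3(1, 2, 3)

# Glue: build a separate Python list by copying every element.
py_list = list(c_arr)


def to_c_array(values):
    """Copy a Python list of ints into a newly allocated C int array."""
    return (ctypes.c_int * len(values))(*values)


round_trip = list(to_c_array(py_list + [4]))
```

Even in this toy case the Python list can grow while the C array cannot, and every crossing is a copy. A real glue layer has to make those decisions for every type in the system.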


An interesting question is - what if our two languages shared more? Do we really need to have separate container types just to get agility and dynamism? The answer turns out to be no, which we'll get to in a minute. As we know, the fact that modern languages share a lot of lower-level components (like garbage collection and JIT compilation) led Microsoft to brand .NET as a multi-language runtime (as an aside, plenty of languages ran on the JVM long before .NET was created; for example Kawa, which dates to 1996). Now here's the thing, though: running on .NET does not make Python objects the same as .NET objects, nor does it make their containers the same.


What is an object?


Let's briefly take a look at what a Python object is. It's a fairly illustrative example of just how different languages can be. In Python, every object instance is by default backed by a dictionary (hash table), with data stored in the __dict__ member. Every property lookup or method call in general has to traverse a chain of hash table lookups. At any point, some other code can come along and add a new entry in an object's dictionary:


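#!/usr/bin/python

class Test(object):
  def __init__(self, a):
    self.a = a

t = Test("hello")
t.b = 42      # adds a brand-new attribute to this one instance
print(t.b)    # prints 42; the parenthesized form runs on Python 2 and 3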

Supporting this level of dynamism is expensive, both in time and space, again because every object instance carries along a mutable hash table under the covers. It means you can't share very much between processes. It makes multi-threading much slower because everything has to be synchronized on that dictionary. Besides being expensive, it's almost never what you actually want, at least by default. You usually want the assignment to t.b to be an error. This is by far my biggest issue with Python. In fairness to Python, it predates almost every other language discussed here.
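To make the per-instance dictionary visible, and to show the opt-in alternative Python does offer, here is a small sketch. The class names `Open` and `Closed` are made up for this example; `__slots__` is a standard Python feature that drops the per-instance `__dict__`, but you have to ask for it, which is exactly the "by default" complaint.

```python
class Open(object):
    """Default behaviour: each instance carries its own mutable dict."""
    def __init__(self, a):
        self.a = a


class Closed(object):
    """Opt-in behaviour: a fixed set of attributes and no instance dict."""
    __slots__ = ("a",)

    def __init__(self, a):
        self.a = a


o = Open("hello")
o.b = 42                      # silently creates a new attribute
open_attrs = dict(o.__dict__)

c = Closed("hello")
try:
    c.b = 42                  # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(open_attrs)
print(rejected)
```

With `__slots__` the stray assignment fails with an AttributeError instead of quietly growing the object.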


Stealing and language evolution


Languages are clearly stealing things from each other, and evolving together. In the Ruby book they often mention how certain parts were taken from other languages. Java and C# are stealing ideas from each other. ECMAScript 4 is clearly rebuilding itself on a more JVM/.NET like class model.



What I've been looking at lately is a new dynamic language that has clearly stolen a lot of the good ideas from Ruby and Python, but is a lot more "native" to a modern runtime (in this case, the JVM): Groovy.


#!/usr/bin/env groovy

class Test {
  String a
}

def t = new Test(a: "hello")
t.b = 42      // fails here: Test has no property named b
println t.b

This results in an exception about a missing b, because its idea of a class is exactly the same as the underlying JVM platform, where objects are much more static by default (this is also true of .NET). Note we can even declare types if we like (or we can just use def). I really like how default constructors work - it's even less typing than both Python and Ruby! It has useful closures, regular expression and hash table literals. Pretty cool. For a more complex example, here's an example of a fairly typical scripting task of log file processing I wrote a few days ago. I'm fairly sold so far, but there is still more to learn. I spent a bit of spare time poking at getting it packaged, but ran into some Maven bootstrapping issues.


More on languages


One random link: An awesome feature of Python is Generators, and if you aren't familiar with them and think of yourself as a "systems programmer", check out this very good slide set.
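If you haven't seen generators used this way, here is a minimal sketch of the pipeline style, applied to a few invented log lines. The function names `grep` and `field` and the sample data are made up for this illustration and are not taken from the slide set.

```python
def grep(pattern, lines):
    """Lazily yield only the lines that contain 'pattern'."""
    for line in lines:
        if pattern in line:
            yield line


def field(lines, index):
    """Lazily yield one whitespace-separated column from each line."""
    for line in lines:
        yield line.split()[index]


# Invented sample data; in practice this could be an open file object.
log = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /form 200",
]

# Nothing is processed until the pipeline is consumed, one line at a time.
ok_paths = list(field(grep(" 200", log), 1))
print(ok_paths)
```

Each stage pulls one item at a time from the stage before it, so the same code works on a file far too large to fit in memory.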


Second to last: some recent additions to my Google Reader feed: Charles Oliver Nutter, John Rose, Lambda the Ultimate.


As an aside that's not directly language related: also new to my feed list is Why, a great artist-programmer churning out amazing works like Shoes. Does anyone else have the feeling that for Why all of these code projects are just what he does in his idle time, and in the next few years he'll emerge from his underground hideout with an army of giant robots and take over the earth?
